PREFACE.


In a sequence of fundamental memoirs, G. Udny Yule, the 
eminent English statistician, has proposed certain methods of time 
series analysis which are of an essentially wider scope than the 
classical methods used in the search for periodicities. The basis 
of the new methods is a concept of flexible periodicity which in 
an ideal case reduces to the classical, functionally rigid periodicity. 
The importance and the broad applicability of the new ideas have 
been stressed particularly in subsequent discussion of the nature 
of business cycles. 

In the recent rapid development of the theory of probability, 
the production of A. Khintchine and A. Kolmogoroff represents 
a genuine discontinuity. A firm, axiomatic foundation has been 
obtained for the theory; other important results belong to the 
theory of random processes, i. e. hypothetical models for the 
analysis of time series. In accordance with the great diversity of 
time series, the main types of random process are of quite different 
structure. 

In the theory of probability, the approaches of G. U. Yule fall 
under the heading of the stationary random process as defined and 
studied by A. Khintchine. The present work might be described 
as a trial to subject the fertile methods of empirical analysis 
proposed by Yule to an examination and a development by the 
use of the mathematically strict tools supplied by the modern theory 
of probability. This statement, however, implies no valuation of 
the results and should be regarded rather as a tribute to my 
sources of inspiration and to the traditions of my milieu of study. 

My most sincere thanks are due to my teacher, Professor Harald 
Cramér. His brilliant courses, distinguished by a spirit of realism 
combined with penetrating logic, have laid the foundation for my 
further work. As far as the present thesis is concerned, this is 
true not only in general but also in respect to particular parts 
[...] Series Analysis. I wish to evidence my deep gratitude to Professor 
Cramér also for the encouragement and interest shown me at all 
times, and culminating in his detailed reading of the first version 
of the manuscript. Our subsequent discussions have caused a revi- [...]

To the Royal Swedish Academy of Sciences I want to express 
my respectful gratitude for a generous grant covering a substantial 
part of the expenses for printing and numerical calculation. 

[...] Mr. W. Feller and Mr. O. Lundberg for numerous stimulating 
discussions and for having read the manuscript and corrected many [...]


Introduction.

1. Remarks on the scope of the study.

Observational series which describe phenomena changing with time 
may be roughly classified in two broad categories, viz. evolutive and 
stationary. In the former case, different sections of the time series 
are dissimilar in one or more respects. For instance, the sectional 
averages may be distinctly different, or some other structural 
property of the series may present variation. In the analysis of 
evolutive time series, time evidently must be allotted a fundamental 
role, e. g. as the independent variable in a trend function, or as 
an absolute scale in studying the development of a phenomenon 
from an initial state of rest. 

Stationary time series are unchanging in respect to their general 
structure. The fluctuations up and down in such a series may 
seem random or show tendencies to regularity — in any case, the 
character of the series is, on the whole, the same in different 
sections. Even without preparation, observational time series are 
frequently stationary. On the other hand, the deviations from a 
trend form a type of derived time series which is often stationary. 
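
The distinction can be made concrete with a small numerical sketch. The series below is artificial (a straight-line trend plus uniform random noise, both chosen here purely for illustration), and the helper names `sectional_means` and `detrend_linear` are hypothetical. The sectional averages of the trended series differ markedly, whereas those of the deviations from a fitted trend nearly coincide.

```python
import random

def sectional_means(series, sections=2):
    """Average of each of `sections` equal-length consecutive sections."""
    size = len(series) // sections
    return [sum(series[i * size:(i + 1) * size]) / size for i in range(sections)]

def detrend_linear(series):
    """Deviations from a least-squares straight line fitted to the series."""
    n = len(series)
    t_mean = (n - 1) / 2
    x_mean = sum(series) / n
    slope = (sum((t - t_mean) * (x - x_mean) for t, x in enumerate(series))
             / sum((t - t_mean) ** 2 for t in range(n)))
    return [x - (x_mean + slope * (t - t_mean)) for t, x in enumerate(series)]

rng = random.Random(0)  # fixed seed, chosen arbitrarily
# Evolutive series: assumed linear trend 0.1*t plus noise in [-1, 1].
evolutive = [0.1 * t + rng.uniform(-1, 1) for t in range(200)]
deviations = detrend_linear(evolutive)

print(sectional_means(evolutive))   # the two halves differ markedly
print(sectional_means(deviations))  # the two halves are nearly equal
```

Equality of sectional averages is of course only one of the structural properties involved; the sketch does not test stationarity in any complete sense.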

In the analysis of stationary time series, time plays the secondary 
role of a passive medium. Judging from the literature on the subject, 
the analysis of stationary time series might seem equivalent to the 
search for periodicities. In the present volume, let this be said at 
once, time series analysis is taken in a much wider sense. 

Considering the classical methods of Fourier and Schuster, the 
hypothesis underlying these methods is that the time series under 
analysis might contain hidden periodicities, that is functional com- 
ponents which are periodic in the strict mathematical sense. It 
is well known that these methods have often been applied with 

1 — 38387. H . Wold . 



2 ANALYSIS OF STATIONARY TIME SERIES [Introd. 1 

definite success, and equally well-known that in many fields of 
scientific research they have met a severe criticism. An essential 
point in the criticism is that the idea of strict periods cannot 
possibly be realistic and adequate in certain applications. It has 
been claimed, for example, that in the theory of business cycles 
the hypothetical approach must be flexible to some extent, admit 
small changes in the periods and the amplitudes etc. A modified 
type of approach thus being called for, the difficulty is to find a 
precisely defined combination of the ideas of periodicity and of 
flexibility. It is evident that a strict hypothetical set-up as required 
can be reached only on the basis of the theory of probability. 

Though the above mentioned critical argument is old, it was not 
until rather recently that approaches have been suggested which 
allow for changes in the waves in the time series under analysis. 
There are two main lines of approach, both of them germinating 
from G. U. Yule. Let these be briefly outlined. 

Starting from a purely random series as given, for example, by 
dice-throwing, G. U. Yule ((1921), (1926)) forms the differences of a 
fixed order, and finds that the series thus obtained presents a tendency 
to regular fluctuations. E. Slutsky ((1927), (1937)) studies 
the effect of more general linear operations, and finds that under 
certain circumstances the resulting series will present sinusoidal 
waves with slowly changing amplitude and phase, waves showing 
a puzzling likeness to the cycles in economic time series. Nice 
examples of this are given by suitable moving averages of the 
primary random series. In the terminology of this study, the ap- 
proaches mentioned are special cases of the scheme of moving averages. 
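
A minimal sketch of both constructions, assuming simulated dice throws as the purely random primary series: first differences in the manner of Yule, and a simple 10-term average as one instance of the linear operations studied by Slutsky. The window length, the seed and the helper names `moving_average` and `lag_correlation` are illustrative assumptions, not taken from the memoirs cited.

```python
import random

def moving_average(series, weights):
    """Weighted moving sum of `series`; output is shorter by len(weights) - 1."""
    w = len(weights)
    return [sum(weights[j] * series[t + j] for j in range(w))
            for t in range(len(series) - w + 1)]

def lag_correlation(series, k):
    """Ordinary correlation between the series and itself shifted k >= 1 steps."""
    a, b = series[:-k], series[k:]
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5

rng = random.Random(1)                               # arbitrary fixed seed
primary = [rng.randint(1, 6) for _ in range(5000)]   # simulated dice throws
smoothed = moving_average(primary, [0.1] * 10)       # 10-term simple average
differences = moving_average(primary, [-1, 1])       # first differences

r_primary = lag_correlation(primary, 1)      # near 0: no regularity
r_smoothed = lag_correlation(smoothed, 1)    # strongly positive: smooth waves
r_diff = lag_correlation(differences, 1)     # negative: rapid up-and-down swings
print(r_primary, r_smoothed, r_diff)
```

The primary series shows practically no dependence between neighbouring terms, while either linear operation introduces a marked dependence, which is what produces the appearance of waves.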

The second type of approach is introduced by G. U. Yule (1927) 
in a study on sunspot numbers. Considering the sunspot index in 
a set of equidistant time points, Yule investigates the multiple 
correlation between these observations, and approximates by the 
use of linear regression analysis each observation by a linear 
function of the next preceding ones. The scheme thus implicitly 
defined will be called the scheme of linear autoregression. Using a 
physical interpretation, Yule gives a suggestive illustration of the 
new idea — a pendulum subjected to a stream of random shocks 
will be ruled by a scheme of linear autoregression. To a certain 
extent, the movement of the pendulum will bear resemblance to a 
free swinging, but the random impulses will cause a continuous 
shift in amplitude and phase. 
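
The pendulum illustration can be sketched numerically with a second-order scheme, in which each value is a linear function of the two preceding ones plus a random shock. The coefficients below correspond to an assumed damped swing with a period of 12 time units and a damping factor of 0.95 per step; these values, the normal distribution of the shocks and the function name `autoregression` are illustrative assumptions.

```python
import math
import random

def autoregression(a1, a2, shocks, x0=0.0, x1=0.0):
    """Iterate x[t] = a1*x[t-1] + a2*x[t-2] + shock[t] from two starting values."""
    x = [x0, x1]
    for e in shocks:
        x.append(a1 * x[-1] + a2 * x[-2] + e)
    return x

def variance(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

# Assumed damped oscillation: period 12 time units, damping 0.95 per step.
damping, period = 0.95, 12
a1 = 2 * damping * math.cos(2 * math.pi / period)
a2 = -damping ** 2

n = 400
free = autoregression(a1, a2, [0.0] * n, x0=1.0, x1=1.0)   # no shocks
rng = random.Random(2)                                      # arbitrary fixed seed
disturbed = autoregression(a1, a2, [rng.gauss(0, 1) for _ in range(n)])

print(max(abs(x) for x in free[-100:]))   # the free swinging has died out
print(variance(disturbed[-200:]))         # the shocks keep the swinging alive
```

Without shocks the movement is a damped harmonic that dies out; with shocks it persists indefinitely, with amplitude and phase shifting continuously.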

Using a comprehensive term, the two schemes mentioned will be 
called schemes of linear regression. It is seen to be a common 
feature of these schemes that a random element plays a funda- 
mental, active role. This constitutes a distinct contrast to the scheme 
of hidden periodicities — as we shall call the hypothesis of strict 
periods — and makes the schemes of linear regression a priori 
plausible in several instances where the scheme of hidden period- 
icities has been criticized. 

From the viewpoint of the theory of probability, the schemes of 
linear regression are special cases of the stationary random process 
as defined and studied by A. Khintchine ((1932) — (1934)). Let us 
discuss the situation in some detail. 

Considering a phenomenon as described by an observational time 
series, let us fix arbitrarily a finite set of time points, say 
$(t) = (t_1, t_2, \ldots, t_n)$. In a probabilistic theory of the phenomenon, we 
must necessarily assume that the behaviour of the time series in 
the $n$ points considered is ruled by a definite probability distribution 
in $n$ dimensions. Generally speaking, this distribution may be taken 
to be defined by a distribution function, say $F(t_1, \ldots, t_n; u_1, \ldots, u_n)$, 
where $u_1, \ldots, u_n$ are real variables. For instance, considering a 
set consisting of but one time point $(t_1)$, the hypothetical function 
$F(t_1; u_1)$ will indicate the probability that the observational value 
in $t_1$ is less than or equal to $u_1$. Having stated this, it is clear 
that we must assume certain relations of consistency between the 
functions $F$ which belong to different sets $(t)$; otherwise the 
hypothetical set-up might contradict itself. For instance, it is evident 
that $F(t_1; u_1)$ must be assumed to satisfy all relations of the type 
$F(t_1; u_1) = F(t_1, t_2; u_1, \infty)$. 

We have seen that the probabilistic treatment of a time series 
requires a set of distribution functions, say $\{F\}$, such that there is 
one function $F$ corresponding to every finite set $(t)$ of time points, 
and that the functions $F$ satisfy certain consistency relations. On 
the other hand, such a hypothetical set-up will give a sufficient 
basis for a formal probabilistic analysis. Any set $\{F\}$ with properties 
as mentioned is called a random process, and according to 
a fundamental theorem of A. Kolmogoroff (1933) such a set $\{F\}$ 
is equivalent to a probability distribution in an infinite number of 
dimensions. Of course, each time point corresponds to one dimen- 
sion in this distribution. 

In defining a random process, we may either choose our points 
$(t)$ quite freely or else restrict them somehow, e. g. to an unbroken 
sequence of equidistant time points. Again following A. Kolmogoroff, 
a random process as defined by the set $\{F\}$ is in the first 
case called continuous, in the case of equidistance discrete. Interpreting 
the random processes as random variables in an infinite 
number of dimensions, it is seen that the number of dimensions is 
enumerative in the case of a discrete process, and non-enumerative 
in the continuous case. 

In applying a probabilistic scheme to a statistical distribution 
— perhaps multidimensional — each individual observation is looked 
upon as a sample value belonging to the hypothetical distribution 
in one or more dimensions. In the same manner, in interpreting 
an observational time series as belonging to a random process, the 
series is regarded as a sample value of the corresponding distribu- 
tion in an infinite number of dimensions. Since a whole time 
series thus constitutes but one element in the statistical population, 
it is evident that the unconditioned random process is far too 
general to be useful in practical applications. According to the 
structure of the time series under analysis, we have to apply special 
types of the random process. 

A. Khintchine's definition of the stationary random process runs as 
follows. Letting $(t) = (t_1, \ldots, t_n)$ represent an arbitrary set of time 
points, and fixing arbitrarily a translation in time of this set, say 
$(t^{(\tau)}) = (t_1 + \tau, \ldots, t_n + \tau)$, a random process as defined by a set 
$\{F\}$ of distribution functions is called stationary if the two functions 
$F$ belonging to the two sets $(t)$ and $(t^{(\tau)})$ are identical. Thus, the 
probability laws assumed to rule the observational time series depend 
on time in such a way that if we replace time as measured from 
a fixed time point by a time variable measured from another time 
point, the probability laws will remain the same. In other words, 
if the development in a time series is known up to a certain time 
point, say t, the probability laws ruling the continued development 
will depend only on the behaviour of the time series up to the 
time point $t$, not on the actual value of $t$. It can be seen that 
the postulated relativity of time is precisely adequate in view of 
the broad class of stationary time series roughly characterized above. 

The present study is exclusively concerned with the theory and 
the applications of the discrete stationary processes. Accordingly, 
problems concerning evolutive time series fall outside the program. 
For instance, trend analysis will not be dealt with at all. Further, 
evolutive random processes, and continuous stationary processes, 
will be touched upon only incidentally. 

As far as I know, A. Khintchine is alone in having dealt with 
the discrete stationary process in full generality; his chief result 
is that the stationary processes are ruled by a law of great numbers. 
The present study being concerned with other aspects of the sta- 
tionary process, we shall next give a few comments on the main 
lines followed. 

Chapter I serves a double purpose. In surveying the leading 
methods in the search for periodicities, particular attention is drawn 
to the hypotheses underlying these methods. By thus pointing out 
the rather narrow basis of the methods considered, the need for 
other types of hypothetical scheme is made clear, and the analysis 
of other types of approach prepared. On the other hand, after 
this rather detailed survey, the hypothesis of hidden periodicities 
will need no further separate treatment. 

Chapter II is reserved for a general analysis of the discrete 
stationary process. Sections 13 and 14 are preparatory, and show 
that in certain respects the stationary processes may be dealt with 
in the same manner as random variables in a finite number of 
dimensions. In particular, the singular case considered in section 
14 corresponds to a multi-dimensional probability distribution which 
is entirely concentrated in a plane or some other linear sub-space. 

The discrete stationary process being extremely general, sections 
15 and 16 give a few examples of processes obtained by different 
specializations. In this way we arrive at strict definitions of the 
processes of linear regression which cover the above mentioned 
approaches suggested by G. U. Yule. Further, the scheme of 
hidden periodicities is obtained by means of a singular stationary 
process. Detailed illustrations of the processes thus defined are 
given through model time series, i. e. series constructed in an arti- 
ficial way on the basis of random sampling numbers. Finally, in 
section 16, the normal stationary process is defined by a straight- 
forward generalization of the normal distribution in a finite number 
of dimensions. 

In the latter part of the second chapter, the structural properties 
of the general stationary process are studied. The field being wide 
and unexplored, the analysis has been restricted to the elementary 
features. Generally speaking, only such properties have been taken 
into consideration which could be studied by the use of linear 
operations and autocorrelation coefficients. As the autocorrelation 
coefficients correspond to the mixed moments of second order in a 
set of ordinary, one-dimensional variables, the analysis will, in 
certain respects, be parallel to the familiar theory of multiple 
correlation. 

In section 17 it is shown that the autocorrelation coefficients of a 
discrete stationary process may always be interpreted as the 
Fourier coefficients of a non-decreasing function. This theorem 
discloses clearly that it is only in special cases that a periodogram 
can tell us something relevant about a time series. 
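
In symbols, the statement may be sketched as follows. The interval of integration, the symbol $W$ and its normalization are assumptions made here for definiteness; they need not coincide with the conventions adopted in section 17.

```latex
% Sketch: autocorrelation coefficients r_k as Fourier (cosine) coefficients
% of a non-decreasing function W on [0, \pi] (normalization assumed).
r_k = \int_{0}^{\pi} \cos (k\lambda)\, dW(\lambda),
\qquad k = 0, \pm 1, \pm 2, \ldots,
\qquad W(\pi) - W(0) = r_0 = 1 .
```

Under this reading, strict periodic components correspond to jumps of $W$ at isolated frequencies, which is the special case a periodogram is designed to detect; where $W$ increases continuously there are no strict periods to be found.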

The linear autoregression analysis as prepared in section 18 and 
developed in section 19 is based on the idea of subjecting the 
discrete stationary process to a treatment which is parallel to a 
time series analysis by means of the methods proposed by G. U. 
Yule (1927) and already referred to. The autoregression analysis 
thus corresponds to a linear regression analysis in a finite set of 
one-dimensional variables. As compared with periodogram analysis, 
the autoregression analysis is more general and efficient. For in- 
stance, considering the forecasts delivered by the different methods, 
the autoregression analysis reaches the limit beyond which we 
cannot proceed when employing only linear methods. 

Using the same tools as in section 19, the analysis of the structure 
of the discrete stationary process is, in section 20, carried further 
in a quite new direction. In spite of its wide comprehensiveness, 
the general discrete stationary process is found to be of a readily 
surveyable structure. In fact, the general process is built up by 
two components which may be interpreted as generalized processes 
of hidden periodicities and of linear regression respectively. 

In point of principle, the methods used in Chapter II are of a 
scope which would admit of generalizations of the analysis in 
different directions. Using non-linear operations, it would be possible 
to perform an autoregression analysis corresponding to curvilinear 
regression analysis in the case of a finite number of variables. 
Further, the analysis might be extended to the case when the time 
series is multi-dimensional, i. e. when several properties of the 
phenomenon observed are studied simultaneously. 

The stochastical difference equations studied in the first two 
sections of Chapter III form a generalization of the ordinary linear 
difference equations. While the solutions of the latter equations 
are ordinary functions, the solutions of the former equations are 
discrete random processes. Considering an ordinary linear difference 
equation, its solution describes how a certain oscillatory mechanism 
will develop under given conditions. On the other hand, if a primary 
stream of impulses is defined by means of a random process, 
the solutions of a corresponding stochastical difference equation 
give the probability laws which will rule the oscillatory mechanism 
when subjected to the primary impulses. Of course, if the actual 
development of the mechanism is known, the corresponding series 
of impulses is readily obtained. Moreover, among the solutions of 
stochastical difference equations we find both evolutive and sta- 
tionary random processes, e. g. the process of linear autoregression 
as strictly defined in section 15. 

The latter part of Chapter III is reserved for a detailed study 
of the processes of linear regression, a chief purpose being to 
illustrate the general analysis in Chapter II, and the theory of 
stochastical difference equations. Particular attention is paid to 
the forecast situation. In contradistinction to the processes of 
hidden periodicities, the processes of linear regression contain an 
active random element which affects the efficiency of the forecast. 
As a matter of fact, in a process of linear regression, the efficiency 
is the less, the longer the interval of time forecasted over. On the 
other hand, the short time forecasts are the more efficient. In view 
of the applications this circumstance is advantageous, for, of course, 
the main interest is always focussed upon the short time forecasts. 
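
The dependence of forecast efficiency on the forecast interval can be illustrated in the simplest special case, a first-order autoregression x[t] = rho*x[t-1] + shock[t] with zero mean and |rho| < 1. This special case, the value rho = 0.8 and the function names are assumptions made for the sketch; the text treats the general process of linear regression. In this case the forecast h steps ahead is rho**h times the last observed value, and the share of the variance it leaves unexplained is 1 - rho**(2*h).

```python
def forecast(last_value, rho, horizon):
    """Linear forecast `horizon` steps ahead for x[t] = rho*x[t-1] + shock[t]."""
    return rho ** horizon * last_value

def forecast_error_share(rho, horizon):
    """Share of the process variance left unexplained by that forecast.

    The forecast error is the sum of the next `horizon` shocks, weighted by
    1, rho, rho**2, ...; relative to the total variance this gives
    1 - rho**(2*horizon).
    """
    return 1 - rho ** (2 * horizon)

for h in (1, 2, 5, 10, 20):
    print(h, forecast(1.0, 0.8, h), forecast_error_share(0.8, h))
```

The unexplained share grows steadily with the horizon and approaches 1, so that long-range forecasts are hardly better than the mean, while short-range forecasts retain most of their efficiency.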

The analysis in Chapter III, too, suggests certain generalizations. 
Thus, nothing prevents us from defining random processes by means 
of non-linear stochastical difference equations. Moreover, multi- 
dimensional random processes may be defined on the basis of systems 
of stochastical difference relations. A few remarks along the latter 
line are given in sections 31 and 32. 

Chapter IV, finally, gives a few applications of the theoretical 
analysis to observational time series of the stationary type. Such 
a series being given, its correlogram — my term for the auto- 
correlation periodogram — is adopted as an indicator of which 
type of process to apply. Explicit applications being given of the 
processes of linear autoregression and of moving averages, general 
methods are indicated for finding suitable numerical values of the 
parameters involved. For instance, assuming a given time series to 
be a moving average of an unknown primary series, suitable values 
for the weights of the hypothetical moving average are derived 
from the correlogram of the given series. Further, a general method 
is given for deriving the primary series which corresponds to such 
a set of weights. 
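
Both steps can be sketched in the simplest conceivable case, a moving average with a single unknown weight, x[t] = e[t] + b*e[t-1]. This is not the general method of Chapter IV, only an illustration under that assumption: the first autocorrelation coefficient is then r1 = b / (1 + b*b), which can be solved for b, and the primary series follows by recursion once a starting value is assumed. The function names and the starting value are hypothetical.

```python
import math

def ma1_weight_from_r1(r1):
    """Weight b (|b| <= 1) of x[t] = e[t] + b*e[t-1] whose first
    autocorrelation coefficient is r1 = b / (1 + b*b); requires |r1| <= 0.5."""
    if r1 == 0:
        return 0.0
    return (1 - math.sqrt(1 - 4 * r1 * r1)) / (2 * r1)

def primary_series(x, b, e_start=0.0):
    """Recover e[t] = x[t] - b*e[t-1], assuming the value e_start before t = 0."""
    e, prev = [], e_start
    for value in x:
        prev = value - b * prev
        e.append(prev)
    return e

b = ma1_weight_from_r1(0.4)   # approximately 0.5, because 0.5 / (1 + 0.25) = 0.4
shocks = [1.0, -2.0, 0.5, 3.0, -1.5]           # arbitrary primary series
x = [shocks[0]] + [shocks[t] + b * shocks[t - 1] for t in range(1, len(shocks))]
recovered = primary_series(x, b)               # reproduces `shocks`
print(b, recovered)
```

The quadratic for b has two roots, b and 1/b; the sketch keeps the one not exceeding 1 in absolute value, for which the recursion does not amplify an error in the assumed starting value.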

The purpose of the applications is to illustrate certain general 
methods of analysis, not to supply a theory of the phenomena 
described by the time series examined. Accordingly, I make little 
account of the hypothetical schemes arrived at, and no attempts 
to test the significance of the parameters determined. On the 
contrary, warnings are repeatedly given against attaching importance 
to the numerical results of the analysis, one reason being that 
significance questions are extremely intricate in time series analysis. 
Another reason for caution is discussed in Appendix B, where it 
is pointed out that the empirical correlation coefficients used in 
time series analysis are quantitatively conditioned by the size of 
the statistical mass under observation. 


2. Principles of notation. 

The present volume being concerned with both theory and 
applications, symbols are needed for probabilistic as well as statistical 
concepts. Since the purpose of this section is to indicate the 
principles of notation, no completeness is aimed at with respect to 
the definitions of the elementary concepts considered. For these, 
reference is made to G. U. Yule and M. G. Kendall (1937) as to 
the theory of statistics, and to H. Cramér (1937) as to the theory 
of probabilities. 

Generally speaking, the notations try to bring into relief the 
correspondence between theoretical and empirical concepts. Thus, 
for parallel theoretical and empirical concepts the same symbol 
will be used — in the latter case marked by a bar. As a rule, 
Greek letters will be reserved for random variables, Roman letters 
for functions, ordinary variables, and constants. 

In agreement with these general principles, random variables as 
dealt with in statistics will be denoted by $\bar{\xi}$, $\bar{\eta}$, etc. Considering 
such a random or statistical variable, say $\bar{\xi}$, its observational values 
in a particular statistical population will be distinguished by running 
indices, e. g. 

(1)    $\bar{\xi}_1, \bar{\xi}_2, \ldots, \bar{\xi}_n$. 

Irrespective of the order of the elements, an observational set 
(1) is uniquely determined by the corresponding function of cumulative 
relative frequencies, say $\bar{F}(u)$. Thus $\bar{F}(u)$, for every real $u$, 
equals the relative number of statistical units with an observational value 
$\le u$. The function $\bar{F}(u)$ will be called the (empirical) distribution 
function of the observational set. By the use of the Stieltjes 
integral, the elementary characteristics of an empirical distribution 
(1) can be conveniently expressed in terms of $\bar{F}(u)$. The average 
of $\bar{\xi}$ in the population considered is 

$\bar{m} = \frac{1}{n} \sum_{i=1}^{n} \bar{\xi}_i = \int_{-\infty}^{\infty} u \, d\bar{F}(u)$. 

The central moment of order $k$ is denoted by $\bar{\mu}_k$, and reads 

$\bar{\mu}_k = \frac{1}{n} \sum_{i=1}^{n} (\bar{\xi}_i - \bar{m})^k = \int_{-\infty}^{\infty} (u - \bar{m})^k \, d\bar{F}(u)$. 

Thus, $\bar{\mu}_2$ is the variance. Denoting the dispersion by $\bar{D}$, we have 

(2)    $\bar{D} = \sqrt{\bar{\mu}_2}$. 
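
For a finite observational set the Stieltjes integrals reduce to plain sums, so these empirical characteristics can be computed directly. The sketch below uses four arbitrary observational values; the function names are hypothetical.

```python
import math

def empirical_distribution(values, u):
    """Relative number of observations with a value <= u."""
    return sum(1 for v in values if v <= u) / len(values)

def mean(values):
    """Average of the observational set."""
    return sum(values) / len(values)

def central_moment(values, k):
    """Central moment of order k, taken about the average."""
    m = mean(values)
    return sum((v - m) ** k for v in values) / len(values)

def dispersion(values):
    """Square root of the central moment of order 2 (the variance)."""
    return math.sqrt(central_moment(values, 2))

obs = [1.0, 2.0, 3.0, 4.0]   # arbitrary observational values
print(empirical_distribution(obs, 2.5), mean(obs),
      central_moment(obs, 2), dispersion(obs))
```

Note that the moments are divided by n, the number of observations, in agreement with the definitions above, and not by n - 1.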

In studying a statistical variable $\bar{\xi}$, we introduce a corresponding 
hypothetical random variable $\xi$. Such a variable, which is also 
called stochastical or aleatory, is completely characterized by its 
distribution function $F(u)$. By definition, $F(u)$ indicates the 
probability that $\xi$ is less than or equal to $u$. This is expressed by the 
relation 

(3)    $F(u) = P[\xi \le u]$, 

where $P$ is the probability function of $\xi$ (cf. fig. 18). 

In the analysis of a variable $\bar{\xi}$ it is actually only the observational 
values $\bar{\xi}_i$ and the variable $\xi$ as defined by the distribution 
function $F(u)$, which will appear in the formal developments. 
However, in the literature the observational values are also often 
supplied with hypothetical parallels, called sample values. For 
instance, the distribution of $\xi$ is spoken of as constituted by an 
infinite population of sample values $\xi_i$. Such a terminology is often 
convenient, and is, incidentally, used also in the present study. As 
we need no notation for these sample values, the symbol $\xi_i$ will, 
as indicated later on, be reserved for representing independent 
variables in random sampling. 

Let $g(x)$ be a function of a real variable, and $\xi$ a random variable 
with distribution function $F(u)$. Under general conditions, $g(\xi)$ 
represents a random variable with expectation given by 

(4)    $E[g(\xi)] = \int_{-\infty}^{\infty} g(u) \, dF(u)$. 

In particular, the elementary characteristics of $\xi$ may be interpreted 
in this way. For instance, the mean ($m$), dispersion ($D$), and the 
central moments ($\mu_k$) are given by 

(5)    $m = E[\xi] = \int_{-\infty}^{\infty} u \, dF(u)$;  $D = \sqrt{\mu_2}$; 

(6)    $\mu_k = E[(\xi - m)^k] = \int_{-\infty}^{\infty} (u - m)^k \, dF(u)$. 

Any multi-dimensional random variable may be looked upon as a 
combination of one-dimensional variables. For such variables we 
shall use notations of type $\xi = [\xi^{(1)}, \ldots, \xi^{(h)}]$. Interpreting $\xi$ in 
this manner, we can often use the same notations as in the 
one-dimensional case. When full information is required, e. g. concerning 
(3), we shall use notations such as 

$F(u_1, \ldots, u_h) = P[\xi^{(1)} \le u_1, \ldots, \xi^{(h)} \le u_h]$. 

In the same way, the expression (4) may be regarded as the expectation 
of a function $g(\xi)$ of a multi-dimensional variable $\xi$, only that $E[g(\xi)]$ 
must be interpreted as a vector in the space of $g$, and that the 
integral must be extended over the space of $\xi$. For instance, let a 
one-dimensional random variable $p(\xi)$ be defined as a function of 
an $h$-dimensional variable $\xi$ with distribution function 
$F(u) = F(u_1, \ldots, u_h)$. Then the distribution function of $p(\xi)$, say $G_p(x)$, 
will equal, for an arbitrarily fixed $x$, the expectation of a function 
$g_{x,p}(u) = g_{x,p}(u_1, \ldots, u_h)$ defined by the relations $g_{x,p}(u) = 1$ 
for $p(u) \le x$, and $g_{x,p}(u) = 0$ for $p(u) > x$. Thus, 

$G_p(x) = E[g_{x,p}(\xi)] = \int g_{x,p}(u_1, \ldots, u_h) \, dF(u_1, \ldots, u_h)$. 

In particular, letting $F(u_1, \ldots, u_h)$ be the distribution function of 
a variable $\xi$ and $G_p(x)$ that of the sum 
$p = p_1 \xi^{(1)} + p_2 \xi^{(2)} + \cdots + p_h \xi^{(h)}$, where the $p_i$'s are real constants, 
we have 

(7)    $G_p(x) = \int_{p(x)} d_{u_1, \ldots, u_h} F(u_1, \ldots, u_h)$, 

the integration being extended over a half-space $p(x)$ defined by 
$p(x) = [p_1 u_1 + \cdots + p_h u_h \le x]$. The present study being chiefly 
concerned with linear functions of random variables, this formula 
will frequently come into use. 

Considering a combined variable $\xi = [\xi^{(1)}, \ldots, \xi^{(h)}]$ with distribution 
function $F(u_1, \ldots, u_h)$, and taking $p_k = 0$ for $k \ne i$ and $p_i = 1$, 
formula (7) gives the distribution function of the individual variable 
$\xi^{(i)}$. Denoting the resulting functions by $F_i(u)$, the variables $\xi^{(i)}$ are 
called independent if, for an arbitrary set $(u_1, \ldots, u_h)$, the following 
relation is satisfied, 

$F(u_1, \ldots, u_h) = F_1(u_1) \cdot F_2(u_2) \cdots F_h(u_h)$. 

In the case of independent variables, the expectation satisfies a 
general relation involving arbitrary functions $g_i$, viz. 

(8)    $E[g_1(\xi^{(1)}) \cdots g_h(\xi^{(h)})] = E[g_1(\xi^{(1)})] \cdots E[g_h(\xi^{(h)})]$. 

Putting $p_i = 1$ for all $i$, formula (7) gives the distribution function, 
say $G(u)$, of the sum of our $h$ variables $\xi^{(i)}$. If these are independent, 
$G(u)$ can be expressed by the use of the familiar composition 
(convolution, Faltung) symbols, viz. 

(9)    $G(u) = F_1(u) * F_2(u) * \cdots * F_h(u)$. 

In the case of two independent variables, the convolution is given by 

$G(u) = F_1(u) * F_2(u) = \int_{-\infty}^{\infty} F_1(u - x) \, dF_2(x)$. 
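
When the second variable takes only a finite number of values, the Stieltjes integral in the convolution reduces to a sum over those values, weighted by their probabilities. A minimal sketch for the sum of two fair dice follows; the dice and the function names are illustrative assumptions, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction

def distribution_function(probabilities):
    """F(u) for a discrete variable given as {value: probability}."""
    return lambda u: sum(p for v, p in probabilities.items() if v <= u)

def convolve(f1, probabilities2):
    """G(u) = sum over x of F1(u - x) * P2(x): the Stieltjes integral of
    F1(u - x) with respect to F2, for a discrete second variable."""
    return lambda u: sum(f1(u - x) * p for x, p in probabilities2.items())

die = {face: Fraction(1, 6) for face in range(1, 7)}   # one fair die
F1 = distribution_function(die)
G = convolve(F1, die)      # distribution function of the sum of two dice

print(G(7))   # 7/12: probability that two dice total at most 7
```

Of the 36 equally likely outcomes, 21 give a total of 7 or less, in agreement with the value 21/36 = 7/12 delivered by the convolution.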

A random sample containing $h$ elements, and belonging to a random 
variable $\xi$ with distribution function $F(u)$, may be defined as a 
sample value of an $h$-dimensional variable obtained by combining $h$ 
independent variables $\xi_1, \ldots, \xi_h$ which have the same distribution function $F(u)$. 
The concept of a random sample is thus purely theoretical, and 
corresponds to that of a statistical population. As already mentioned, 
we need no particular notation for the elements in a random 
sample. 

The ordinary correlation coefficient between two interdependent 
random variables, say $\xi^{(i)}$ and $\xi^{(k)}$, will be denoted $r = r(\xi^{(i)}, \xi^{(k)})$. 

Let $\ldots, \xi(-1), \xi(0), \xi(1), \xi(2), \ldots$ be a sequence of interdependent 
random variables such that the characteristics 

(10)    $m(\xi; t) = E[\xi(t)]$;  $\mu(\xi; t, k) = E[\xi(t) \cdot \xi(t + k)]$ 

will exist for all integral $t$ and $k$. Let it further be assumed that 
the quantities defined by 

(11)    $m(\xi) = \lim_{n \to \infty,\; n' \to -\infty} \frac{1}{n - n' + 1} \sum_{t=n'}^{n} m(\xi; t)$;  $\mu_k(\xi) = \lim_{n \to \infty,\; n' \to -\infty} \frac{1}{n - n' + 1} \sum_{t=n'}^{n} \mu(\xi; t, k)$ 

will exist. Under these circumstances, the coefficients $r_k$ defined by 

(12)    $r_k = r_k(\xi) = \frac{\mu_k(\xi) - m^2(\xi)}{\mu_0(\xi) - m^2(\xi)}$ 

will be called the autocorrelation coefficients of the sequence $\xi(t)$. 
It should be observed that this definition holds also in the special 
case when $\xi(t)$ reduces to an ordinary function of $t$. 

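Let (1) represent an empirical time series obtained by observations 
in the equidistant time points $t = 1, 2, \ldots, n$. Following G. U. 
Yule (1926), the coefficients $\bar{r}_k$ defined by 

(13)    $\bar{r}_k = \frac{\frac{1}{n-k} \sum_{t=1}^{n-k} (\bar{\xi}_t - \bar{m}_1)(\bar{\xi}_{t+k} - \bar{m}_2)}{\bar{D}_1 \cdot \bar{D}_2}$, 

where 

$\bar{m}_1 = \frac{1}{n-k} \sum_{t=1}^{n-k} \bar{\xi}_t$,  $\bar{m}_2 = \frac{1}{n-k} \sum_{t=k+1}^{n} \bar{\xi}_t$, 

and 

$\bar{D}_1^2 = \frac{1}{n-k} \sum_{t=1}^{n-k} (\bar{\xi}_t - \bar{m}_1)^2$,  $\bar{D}_2^2 = \frac{1}{n-k} \sum_{t=k+1}^{n} (\bar{\xi}_t - \bar{m}_2)^2$, 
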

will be called serial coefficients. These obviously form an empirical 
parallel to the autocorrelation coefficients. 
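
A serial coefficient of this kind can be computed as the ordinary correlation between the first n - k and the last n - k observations, each part being taken about its own average. The sketch below assumes exactly that form; the function name and the test series are illustrative.

```python
import math

def serial_coefficient(series, k):
    """Correlation between the first n-k and the last n-k observations,
    each part taken about its own average (k >= 1)."""
    first, last = series[:-k], series[k:]
    m1 = sum(first) / len(first)
    m2 = sum(last) / len(last)
    cov = sum((a - m1) * (b - m2) for a, b in zip(first, last))
    var1 = sum((a - m1) ** 2 for a in first)
    var2 = sum((b - m2) ** 2 for b in last)
    return cov / math.sqrt(var1 * var2)

alternating = [1.0, -1.0] * 5                 # strict period of 2 time units
print(serial_coefficient(alternating, 1))     # close to -1
print(serial_coefficient(alternating, 2))     # close to +1
```

Computing the coefficient for k = 1, 2, 3, ... in turn gives the sequence of values from which a correlogram is drawn.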



CHAPTER I. 


A survey of hypotheses and methods proposed for 
the analysis of time series. 

3. Scope and disposition. 

This chapter aims at giving a historical perspective to the 
investigations in the subsequent chapters. Within the bounds of 
the survey fall the fundamental facts concerning the principal 
theoretical schemes set up for the study of stationary time series 
in one dimension. The leading methods for fitting the considered 
schemes to observational data will also be examined. For the sake 
of concreteness some descriptive methods will be touched upon 
incidentally. The survey being concerned with a general outline 
only, reference is given to H. Burkhardt (1904) and K. Stumpff 
((1927), (1937)) for completion. 

The purely functional schemes will be dealt with first among 
the theoretical models for a given time series. At the opposite 
extreme is taken that purely probabilistic scheme in which the 
series is regarded as a random sample of an aleatory variable. The 
other schemes may be looked upon as intermediate cases. Of the 
mixed schemes, the approach of hidden periodicities is treated first. 
The series is here assumed to be additively composed of indepen- 
dent functional and random elements. The last section of the survey 
deals with the scheme of moving averages, studied in certain 
special cases by G. U. Yule (1921) and E. Slutsky (1927), and 
with the scheme proposed by G. U. Yule (1927) under the name of 
disturbed harmonics (cf. p. 2). 

The time points considered will be equidistant. The unit of 
equidistance will be taken for the time unit, a simplification which 
evidently does not involve any loss of generality. 


4. Functional schemes. 

The functional schemes aim at a perfect functional representation 
of the empirical data. 

As a first example of functional approach, we take the hypothesis 
of a periodic function. Assuming the period to equal and 
denoting the hypothetical function by x{t), the following relation 
will be satisfied for any t 


(14) 


x{t) — x{t — /z,) =r 0. 


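As $t$ is given for integral values only, it will be sufficient to 
consider the case of an integral $h$. Then (14) forms an ordinary 
difference equation of order $h$. According to the theory of difference 
equations, the solutions of (14) may be written 

(15)    $x(t) = m + \sum_k C_k \cos\left(\frac{2\pi k}{h}\,t + \varphi_k\right) = m + \sum_k \left(A_k \cos\frac{2\pi k}{h}\,t + B_k \sin\frac{2\pi k}{h}\,t\right),$ 

where $k$ runs from 1 to $(h-1)/2$ or $h/2$. The real parameters 
$C_k$, $\varphi_k$ $(-\pi/2 < \varphi_k \le \pi/2)$, $A_k$, $B_k$ are connected by the relations 

(16)    $C_k^2 = A_k^2 + B_k^2, \qquad \varphi_k = \arctan(-B_k/A_k).$ 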

The approach now described will be termed the scheme of periodic 
functions. According to (15), the approach function may be considered 
to be composed of superposed harmonics, each having an 
amplitude $C_k$, a phase $\varphi_k$, and a frequency $\lambda_k$ given by 

$\lambda_k = \frac{2\pi}{h} \cdot k.$ 

The expression (15) may be taken as the basis for various generalized 
hypotheses. We start with the approach 

(17)    $x(t) = m + \sum_{k=1}^{s} C_k \cos(\lambda_k t + \varphi_k) = m + \sum_{k=1}^{s} (A_k \cos \lambda_k t + B_k \sin \lambda_k t),$ 

which will be the scheme of superposed harmonics. Alternately, 
the function (17) will be called a composed harmonic. 

Denoting by $p_k$ the periods of the individual harmonics in (15) 
or (17), we have 

(18)    $p_k = 2\pi/\lambda_k.$ 

In the scheme (15) the periods of the individual harmonics 
representing $x(t)$ are seen to be true fractions of $h$. In the generalized 
scheme (17) this restriction will not be laid down on the individual 
periods as given by (18). On the other hand, since $t$ takes on 
integral values only, it follows that for any integer $n$ the substitution 
of $\lambda_k + 2\pi n$ for $\lambda_k$ will have no effect in (17). Neither will 
$x(t)$ be affected by a simultaneous substitution of $-\lambda_k$ for $\lambda_k$, 
and $-\varphi_k$ for $\varphi_k$. Thus it would involve no loss of generality to 
assume that $0 < \lambda_k \le \pi$. This means that, in point of principle, 
the analysis must be restricted to periods not less than 2 time 
units. A study of shorter periods requires a reduction of the time 
unit chosen as a basis for the analysis. However, unless explicitly 
stated otherwise, we shall merely assume that $0 < \lambda_k$. 
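
The invariance just described can be checked numerically. The following is a minimal sketch, assuming NumPy is available; the function name `harmonic` and the parameter values are illustrative and not taken from the text. It evaluates a single harmonic at integral time points and compares it with the harmonics obtained by the substitutions mentioned above.

```python
import numpy as np

def harmonic(t, C, lam, phi, m=0.0):
    """Single harmonic m + C*cos(lam*t + phi) evaluated at the time points t."""
    return m + C * np.cos(lam * t + phi)

t = np.arange(0, 25)            # integral time points only
C, lam, phi = 1.5, 0.9, 0.4     # illustrative values (assumed)

base = harmonic(t, C, lam, phi)
shifted = harmonic(t, C, lam + 2 * np.pi * 3, phi)   # lam replaced by lam + 2*pi*n, n = 3
mirrored = harmonic(t, C, -lam, -phi)                # lam and phi change sign together
folded = harmonic(t, C, 2 * np.pi - lam, -phi)       # both substitutions combined

# At integral t the four harmonics coincide, so frequencies outside (0, pi]
# cannot be distinguished from their images inside that interval.
print(np.allclose(base, shifted), np.allclose(base, mirrored), np.allclose(base, folded))
```

At non-integral time points the four curves differ, which is why a study of periods shorter than 2 time units requires a finer time unit.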

A function $x(t)$ of type (17) belongs, as is well known, to the class 
of almost periodic functions in the sense of H. Bohr (see e.g. (1925)). 
The following property of an almost periodic function $x(t)$ is 
recorded for later use: An $\varepsilon > 0$ being arbitrarily given, there exists 
for every number $T_0$ an integer $T = T(\varepsilon, T_0) > T_0$ such that for 
every $t$ (see H. Bohr (1925), p. 88) 

(19)    $|x(t + T) - x(t)| < \varepsilon.$ 

A rough description of a scheme of superposed harmonics is 
yielded by its periodogram. This has the frequency $\lambda > 0$ for 
horizontal axis, and indicates by ordinates in $\lambda_k$ the corresponding 
squared amplitudes $C_k^2$. The periodogram is met in several 
variants, e.g. with the periods $p_k = 2\pi/\lambda_k$ as abscissae or with 
the amplitudes $C_k$ as ordinates. It is seen that it does not pay 
any regard to the constant term or to the phases appearing in the 
expression (17). Another variant is the integrated periodogram. 
This is a function, say $S(\lambda)$, defined by 

(20)    $S(\lambda) = \sum_{\lambda_k \le \lambda} C_k^2 \Big/ \sum_{k=1}^{s} C_k^2.$ 

Thus $S(\lambda)$ is a step (or saltus) function which is proportionate to the 
sum of squared amplitudes with frequency not greater than $\lambda$. An 
example of periodogram and corresponding integrated periodogram 
is given in the figure below. 
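
As a small illustration of the integrated periodogram, the sketch below evaluates the share of the total squared amplitude carried by the harmonics with frequency not greater than a given value. It assumes NumPy; the function name and the three harmonics used are hypothetical examples, not data from the text.

```python
import numpy as np

def integrated_periodogram(lam_k, C_k, lam):
    """Share of the total squared amplitude at frequencies lam_k <= lam."""
    lam_k = np.asarray(lam_k, dtype=float)
    power = np.asarray(C_k, dtype=float) ** 2
    lam = np.atleast_1d(lam).astype(float)
    # For each requested lam, add up C_k^2 over the harmonics with lam_k <= lam.
    mask = lam_k[None, :] <= lam[:, None]
    return (mask * power[None, :]).sum(axis=1) / power.sum()

lam_k = np.array([0.5, 1.2, 2.4])   # frequencies of three harmonics (assumed)
C_k = np.array([2.0, 1.0, 1.0])     # their amplitudes (assumed)
grid = np.array([0.1, 0.5, 1.0, 1.2, 3.0])

S = integrated_periodogram(lam_k, C_k, grid)
print(S)   # step function rising from 0 to 1, with jumps at 0.5, 1.2 and 2.4
```

The ordinary periodogram corresponds to the jumps of this step function, here proportional to 4, 1 and 1.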

In the study of light, the periodogram has an experimental parallel in the 
spectrum. The prism spreads the light waves according to their frequency $\lambda_k$, and 
the individual lines in the spectrum indicate the energy of respective wave 
components. This energy is proportionate to the squared amplitude. The energy 
represented in an interval of the spectrum is thus proportionate to the sum of the 
ordinates in the corresponding interval of the ordinary periodogram, and 
proportionate to the increase of the integrated periodogram in the same interval. 

Fig. 1. Ordinary periodogram (vertical lines), and corresponding integrated 
periodogram ($S(\lambda)$, horizontal lines). 

Analysis of white light produced spectra where the lines were thin and lying 
very close together. This fact gave rise to the idea of continuous spectra and 
periodograms, in so far as the energy belonging to an interval was thought of as 
the integral of a spectral density. A survey and a development of the mathematical 
theory used in this connexion has been given by N. Wiener (1930). Translating to 
the terminology of the present study, this theory is based upon an analysis of the 
function 

(21)    $Q(u) = \lim_{z \to \infty} \frac{1}{2z} \int_{-z}^{z} (x(t+u) - m) \cdot (x(t) - m)\, dt,$ 

where 

$m = \lim_{z \to \infty} \frac{1}{2z} \int_{-z}^{z} x(t)\, dt.$ 

It is seen that if $x(t)$ is given by (17), $Q(u)$ reduces to 

(22)    $\frac{1}{2} \sum_{k=1}^{s} C_k^2 \cdot \cos \lambda_k u.$ 

Modifying by a constant factor the function $S(\lambda)$ defined by Wiener, this reduces 
in the non-complex case to 

(23)    $S(\lambda) = \frac{2}{\pi} \int_{0}^{\infty} \frac{Q(u)}{Q(0)} \cdot \frac{\sin \lambda u}{u}\, du.$ 

$S(\lambda)$ is called the integrated periodogram of $x(t)$, for in the case of superposed 
harmonics (17), $S(\lambda)$ as given by (23) reduces to (20) (see e.g. H. S. Carslaw (1930), 
p. 322). Wiener shows that $S(\lambda)$ is always a non-decreasing function, and that there 
exist functions $x(t)$ with continuous integrated periodograms (23). 

N. Wiener applies the generalized harmonic analysis also to functions defined 
by a random scheme. For instance, if $x(t)$ in integral intervals is 1 or $-1$ with 
equal probability, and if the values taken on in different intervals are independent, 
$S(\lambda)$ is with probability 1 given by 

(24)    $S(\lambda) = \frac{2}{\pi} \int_{0}^{\lambda} \frac{1 - \cos u}{u^2}\, du.$ 



5. On applied harmonic analysis. 

Let an observational time series be represented by (1). The 
problem is to choose a suitable hypothetical scheme, and to find 
for the parameters involved numerical values yielding as good fit 
as possible to the empirical data. 

In case the observational data are strictly periodic, say with 
period $p$, an application of (15) by means of the Fourier analysis 
will yield a perfect fit. It will be sufficient to consider the data 
$\xi_1, \ldots, \xi_p$ ranging over one period. However, since the formulae 
become particularly simple in case of an even period, and since we 
may take the double period for a basis if $p$ is odd, let it be assumed 
that $h = p = 2q$. The Fourier formulae for the $2q$ parameters $m$, $A_q$, 
$A_k$, $B_k$, where $k = 1, 2, \ldots, q-1$, then read as follows (see e.g. 
H. S. Carslaw (1930), p. 325) 

(25)    $m = \frac{1}{p} \sum_{t=1}^{p} \xi_t, \qquad A_q = \frac{1}{p} \sum_{t=1}^{p} \xi_t \cos \pi t,$ 

        $A_k = \frac{1}{q} \sum_{t=1}^{p} \xi_t \cos \frac{\pi k t}{q}, \qquad B_k = \frac{1}{q} \sum_{t=1}^{p} \xi_t \sin \frac{\pi k t}{q}.$ 


In the approach (17), the essential problem is to evaluate the 
frequency numbers $\lambda_k$. The principal method is that of A. Schuster 
((1898), (1900)) which is based upon the construction of an empirical 
periodogram, say $C^2(\lambda)$. The formulae required are (see e.g. K. 
Stumpff (1927), p. 103) 

(26)    $A(\lambda) = \frac{2}{n} \sum_{t=1}^{n} (\xi_t - m) \cdot \cos \lambda t, \quad 0 < \lambda < \pi,$ 

        $B(\lambda) = \frac{2}{n} \sum_{t=1}^{n} (\xi_t - m) \cdot \sin \lambda t, \quad 0 < \lambda < \pi;$ 

(27)    $C^2(\lambda) = A^2(\lambda) + B^2(\lambda).$ 

For $\lambda = \pi$, the factor 2 in $A$ and $B$ must be omitted. 

A graph of the curve $C^2(\lambda)$ presents characteristic maxima, the 
abscissae $\lambda_k$ of which are taken for the frequency numbers sought 
for. The corresponding parameters $A_k = A(\lambda_k)$ and $B_k = B(\lambda_k)$ are 
obtained from (26). 
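
The periodogram construction can be sketched in a few lines. The example below assumes NumPy and takes for A and B the sums of the deviations from the mean against cosine and sine, multiplied by 2/n, with the factor 2 omitted at the frequency pi; the function name, the test series and the frequency grid are illustrative assumptions.

```python
import numpy as np

def schuster_periodogram(xi, lam, m=None):
    """Return A(lam), B(lam) and C^2(lam) = A^2 + B^2 for the series xi_1..xi_n."""
    xi = np.asarray(xi, dtype=float)
    n = xi.size
    if m is None:
        m = xi.mean()                      # assumption: m is the sample mean
    t = np.arange(1, n + 1)
    lam = np.atleast_1d(lam).astype(float)
    dev = xi - m
    A = 2.0 / n * (dev[None, :] * np.cos(np.outer(lam, t))).sum(axis=1)
    B = 2.0 / n * (dev[None, :] * np.sin(np.outer(lam, t))).sum(axis=1)
    at_pi = np.isclose(lam, np.pi)         # the factor 2 is omitted for lam = pi
    A[at_pi] /= 2.0
    B[at_pi] /= 2.0
    return A, B, A ** 2 + B ** 2

# Synthetic series: one harmonic with A = 3, B = 1 around the level 5 (assumed data).
n = 120
t = np.arange(1, n + 1)
lam_true = 2 * np.pi * 10 / n
xi = 5.0 + 3.0 * np.cos(lam_true * t) + 1.0 * np.sin(lam_true * t)

grid = 2 * np.pi * np.arange(1, n // 2) / n    # trial frequencies in (0, pi)
A, B, C2 = schuster_periodogram(xi, grid)
lam_hat = grid[np.argmax(C2)]                  # abscissa of the maximum
print(lam_hat, C2.max())
```

With real data the maximum is not confined to the trial grid, so a finer grid around each peak is usually needed.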

The periodogram method gives much valuable information about 
the series under investigation, but is rather inconvenient. However, 
a careful fit of (17) to observational data seems to necessitate 
tedious computations, so the labour seems due to the problem, not 
to the method. Nevertheless, many simplified and, accordingly, 
approximate methods have been proposed. One type of these is of 
interest for the sequel because it is based upon the differential or 
difference relations satisfied by a sum of harmonics. The first method 
of this kind, that of S. Oppenheim (1909), utilizes the fact that for 
arbitrary $C_k$ and $\varphi_k$ the function $x(t)$ given by (17) satisfies a 
differential equation of $2s$:th order, 

(28)    $x^{(2s)}(t) + g_1 \cdot x^{(2s-2)}(t) + \cdots + g_{s-1} \cdot x''(t) + g_s \cdot [x(t) - m] = 0,$ 

with constant coefficients $g_i$ such that the equation 

(29)    $z^s + g_1 \cdot z^{s-1} + \cdots + g_{s-1} \cdot z + g_s = 0$ 

has the roots $-\lambda_1^2, \ldots, -\lambda_s^2$. Identifying $\xi_t$ with $x(t)$, and 
taking for $m$ the empirical mean, S. Oppenheim uses the identity (28) for $s$ successive 
$t$-values, the $g_i$ being so far undetermined. These relations are 
considered a system determining the $g_i$'s. Inserting the resulting 
$g_i$'s in (29), the solving of this equation gives, finally, the frequency 
numbers desired. 

The derivatives required for the system of identities (28) S. Oppenheim 
obtains from the observational differences by the 
well-known serial development (see e.g. E. T. Whittaker and 
G. Robinson (1926), p. 64). The intricate passage from differences 
to differentials is avoided in the modification of the Oppenheim 
method given by H. Bruns (1911), who starts from a certain identity 
between central differences satisfied by $x(t)$, viz. 

(30)    $\Delta^{2s} x(t - s) + h_1 \cdot \Delta^{2s-2} x(t - s + 1) + \cdots + h_{s-1} \cdot \Delta^2 x(t - 1) + h_s \cdot [x(t) - m] = 0.$ 

Here the constant coefficients $h_i$ are such that the roots of the 
equation 

$z^s + h_1 \cdot z^{s-1} + \cdots + h_{s-1} \cdot z + h_s = 0$ 

are $-\kappa_1^2, \ldots, -\kappa_s^2$, where 

(31)    $\kappa_k = 2 \sin (\lambda_k / 2).$ 
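
To make the difference-equation idea concrete, here is a minimal sketch in the spirit of the Bruns modification, not a reproduction of his computing scheme. It rests on the fact that the second central difference of cos(lam*t + phi) equals -4 sin^2(lam/2) times the same harmonic, so a sum of s harmonics satisfies a linear identity between its even central differences. The use of least squares over all available time points, the function name and the test series are assumptions made for the illustration; NumPy is assumed.

```python
import numpy as np

def difference_method_frequencies(x, s, m):
    """Estimate s frequencies from the identity between even central differences."""
    x = np.asarray(x, dtype=float) - m
    # diffs[j] holds the central difference of order 2j; entry i is centred at t = i + j.
    diffs = [x]
    for _ in range(s):
        d = diffs[-1]
        diffs.append(d[2:] - 2.0 * d[1:-1] + d[:-2])
    # Keep only the time points at which all differences up to order 2s exist.
    aligned = [d[s - j: d.size - (s - j)] for j, d in enumerate(diffs)]
    # d^(2s) + h_1 d^(2s-2) + ... + h_(s-1) d^2 + h_s (x - m) = 0, solved for h.
    lhs = np.column_stack([aligned[j] for j in range(s - 1, -1, -1)])
    rhs = -aligned[s]
    h, *_ = np.linalg.lstsq(lhs, rhs, rcond=None)
    roots = np.roots(np.concatenate(([1.0], h)))       # roots are -kappa_k^2
    kappa = np.sqrt(np.clip(-roots.real, 0.0, 4.0))
    return np.sort(2.0 * np.arcsin(kappa / 2.0))       # kappa = 2 sin(lam / 2)

# Noise-free composed harmonic with two components (assumed example).
t = np.arange(40)
x = 3.0 + 2.0 * np.cos(0.7 * t + 0.3) + 1.0 * np.cos(1.9 * t - 1.0)
lam_hat = difference_method_frequencies(x, s=2, m=3.0)
print(lam_hat)
```

The example is deliberately free of random disturbances; the text later points out that methods of this kind are sensitive to a superposed random component.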

The fact that different functional schemes may give a good fit 
gives rise to the question of which scheme should be preferred 
when analysing a given time series. This is a particular aspect of 
the general test problem which is fundamental in all practical ap- 
plications. However, the most important aspects of the test prob- 
lem belong to the theory of probability, and will be touched upon 
later. As purely functional test methods may be regarded those 
which in principle consist in an extension — extrapolation or, some- 
times, interpolation — of the observational material, and a com- 
parison with the corresponding values of the functions fitted to the 
original data. 


6. On the linear difference equation. 

A class of functions of importance in the sequel, though not as 
a scheme for time series, is formed by the solutions of linear differ- 
ence equations with constant coefficients, 

(32)    $(x(t) - m) + a_1 \cdot (x(t-1) - m) + \cdots + a_h \cdot (x(t-h) - m) = 0, \quad a_h \neq 0.$ 

Writing (30) on the form (32), the resulting sequence $a_i$ will be 
symmetrical. Now, since (30) is a special case of (32), the solu- 
tions of (32) embrace (17), and are well-known to be (see e. g. P. M. 
Marbles (1932)) 

(33)    $x(t) = m + \sum_{k=1}^{l} p_k^t \cdot H_k^{(m_k - 1)}(t) + \sum_{k=1}^{s} q_k^t \cdot \left[ H_{k,1}^{(n_k - 1)}(t) \cos \lambda_k t + H_{k,2}^{(n_k - 1)}(t) \sin \lambda_k t \right],$ 

where $H^{(r)}$ stands for a polynomial of order $r$. While the polynomials 
$H$ are of arbitrary coefficients, their orders are the same in 
all solutions. In fact, the orders are determined by the characteristic 
equation of (32), viz. 

(34)    $z^h + a_1 z^{h-1} + \cdots + a_{h-1} z + a_h = \prod_{k=1}^{l} (z - p_k)^{m_k} \cdot \prod_{k=1}^{s} (z^2 + 2 b_k z + q_k^2)^{n_k} = 0,$ 

where the factors in the second member are real. 

The frequencies $\lambda_k$ in (33) are connected with the characteristic 
equation by the relations 

(35)    $\cos \lambda_k = -b_k / q_k.$ 

The asymptotical behaviour of $x(t)$ is dependent on the exponential 
factors $p_k^t$ and $q_k^t$, the bases of which are likewise seen to be 
uniquely determined by (34). 

For later application it should be noticed that a necessary 
and sufficient condition for the convergence of $\sum_{t=1}^{\infty} (x(t) - m)^2$ and 
$\sum_{t=1}^{\infty} |x(t) - m|$, for any values taken on by the arbitrary coefficients 
in the polynomials $H$, is that $|p_k| < 1$ and $|q_k| < 1$ for all $k$. 
Another wording of the condition is that all roots of the characteristic 
equation (34) shall lie within the periphery of the unit circle. 
In such a case, $x(t)$ will be referred to as describing a damped 
oscillation. 
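
The root condition is easy to check numerically. The following sketch assumes NumPy; the function names and coefficient values are illustrative. It computes the roots of the characteristic polynomial z^h + a_1 z^(h-1) + ... + a_h for given coefficients of the difference equation, tests whether they all lie inside the unit circle, and iterates the equation forward to show the damping.

```python
import numpy as np

def characteristic_roots(a):
    """Roots of z^h + a_1 z^(h-1) + ... + a_h = 0 for a = (a_1, ..., a_h)."""
    return np.roots(np.concatenate(([1.0], np.asarray(a, dtype=float))))

def is_damped(a):
    """True if every characteristic root lies strictly inside the unit circle."""
    return bool(np.all(np.abs(characteristic_roots(a)) < 1.0))

def solve_difference_equation(a, start, steps, m=0.0):
    """Iterate the homogeneous equation forward from h starting values x(0), ..., x(h-1)."""
    a = np.asarray(a, dtype=float)
    x = list(np.asarray(start, dtype=float) - m)
    for _ in range(steps):
        recent = x[-1: -len(a) - 1: -1]          # x(t-1), ..., x(t-h)
        x.append(-float(np.dot(a, recent)))
    return np.asarray(x) + m

damped = [-1.2, 0.72]      # complex roots of modulus sqrt(0.72), about 0.85 (assumed values)
explosive = [-1.2, 1.1]    # complex roots of modulus sqrt(1.1) > 1 (assumed values)

x = solve_difference_equation(damped, start=[1.0, 0.5], steps=200)
print(is_damped(damped), is_damped(explosive), np.sum(x ** 2))
```

For the damped coefficients the partial sums of squares settle quickly, in line with the convergence statement above.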

A second property of (32) will also be used later. Let the arbitrary 
coefficients in (33) be fixed under the single condition that 
no polynomial $H$ vanishes; then, if some $p_k$ or some $q_k$ is different 
from unity in modulus, formula (33) shows that $|x(t)|$ cannot possibly 
be uniformly bounded in $(-\infty < t < \infty)$. In the same way, 
any solution which does not belong to an equation of lower order 
is unbounded if (34) presents a multiple root. Thus, if a solution 
$x(t)$ of (32) satisfies no linear equation of lower order and if $|x(t)|$ is 
uniformly bounded in $(-\infty < t < \infty)$, then $|p_k| = |q_k| = m_k = n_k = 1$, 
i.e. the equation (32) is of the special type (30), and $x(t)$ is of 
type (17). 


7. A purely probabilistic approach. 

A typical probabilistic hypothesis about a given time series is to 
regard the observational values as a random sample of a certain 
aleatory variable. The complete hypothetical set-up thus consists 
of a sequence of random variables, say $\ldots, \eta(t-1), \eta(t), \eta(t+1), 
\ldots$, which are mutually independent, and have identical distribution 
functions, say $F(x)$. 

Since the hypothesis under consideration consists of two elements, 
the test methods are of two kinds: a) those testing the goodness of 
fit of the hypothetical function $F(x)$ to the empirical distribution 
function $F^*(x)$ characterizing the observational series, and b) those 
testing the randomness in the observational series. 

A perfect fit to the data being in contrast to the idea of randomness, 
an amount of arbitrariness is in place in the choice of a 
hypothetical distribution function $F(x)$ characterizing the aleatory 
variables $\eta(t)$. 

The classical method for testing goodness of fit is the $\chi^2$-method 
of K. Pearson (see e.g. G. U. Yule and M. G. Kendall (1937), 
Chapter 22). Another method has been proposed by H. Cramér 
((1927) p. 112, and (1928) p. 145), and, under the term of $\omega^2$-method 
(Summenlinienverfahren), by R. v. Mises ((1930) p. 316). The latter 
method, an interesting modification of which has been given by 
N. Smirnoff (1936), does not suffer from the well-known arbitrariness 
implied in the $\chi^2$-method. In Appendix A is given a labour-saving 
device, whereby the $\omega^2$-method is also made more convenient 
than the $\chi^2$-method. 

The hypothetical randomness lies in the independence relations of 
the type (8). Accordingly, the tests of randomness are tests 
examining various particular instances of these relations. For example, 
since the autocorrelation coefficients $r_k$ vanish for $k > 0$, the serial 
coefficients must approximate zero for $k > 0$ (the particular case 
$k = 1$ is the Abbe-Helmert criterion; cf. K. Stumpff (1927), p. 8). 

An important instance of the general test problem is concerned 
with the choice between functional and probabilistic schemes. This 
problem is dealt with in the expectance theory (see e.g. K. Stumpff 
(1927), p. 115), which was founded by A. Schuster ((1898), (1900)) and 
studies, i.a., the distribution of the periodogram ordinates $C^2(\lambda)$ 
obtained when substituting a set of independent random variables 
$\eta(t)$ (cf. p. 11) for (1) in (26). Taking $m = E[\eta(t)]$ in (26), the basic 
formulae of the expectance theory read 

(36)    $E[A(\lambda)] = E[B(\lambda)] = 0, \quad 0 < \lambda \le \pi,$ 

(37)    $E[C^2(\lambda)] = \frac{4}{n} D^2(\eta), \quad 0 < \lambda < \pi.$ 

In case $\lambda = \pi$ the factor 4 in the latter formula must be omitted. 
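
A quick simulation can make the expected periodogram level tangible. The sketch below assumes NumPy and independent normal variables with mean zero and variance sigma^2; for the ordinate C^2 = A^2 + B^2, with A and B formed as 2/n times the cosine and sine sums, the expectation then equals 4*sigma^2/n at any frequency between 0 and pi. The sample sizes, the seed and the function name are arbitrary choices for the illustration.

```python
import numpy as np

def periodogram_ordinate(eta, lam):
    """C^2(lam) for each row of eta, taking m = 0 and 0 < lam < pi."""
    n = eta.shape[-1]
    t = np.arange(1, n + 1)
    A = 2.0 / n * (eta * np.cos(lam * t)).sum(axis=-1)
    B = 2.0 / n * (eta * np.sin(lam * t)).sum(axis=-1)
    return A ** 2 + B ** 2

rng = np.random.default_rng(0)
n, sigma, lam = 50, 1.5, 1.0                     # assumed illustration values
eta = rng.normal(0.0, sigma, size=(20000, n))    # 20000 independent random series

mean_C2 = periodogram_ordinate(eta, lam).mean()  # Monte Carlo estimate of E[C^2(lam)]
expected = 4.0 * sigma ** 2 / n
print(mean_C2, expected)
```

An observed ordinate only suggests a genuine periodicity when it stands well above this level, which is the question the expectance theory is designed to answer.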

Having now touched upon some typical functional and probabil- 
istic schemes set up for the analysis of time series, we are in a 
position to pass on to some intermediary schemes. Of these, two types 
may be distinguished which are of fundamentally different character. 
Since the terminology does not seem to be established, I propose 
for the two types the names »schemes of hidden periodicities» and 
»schemes of linear regression». The former schemes are the earlier 
ones, and are dealt with in the next section. Some critical remarks 
on these schemes follow in section 9. The chapter concludes with 
some preliminary remarks on the schemes of linear regression. 


8. A scheme of hidden periodicities. 

A simple approach of hidden periodicities is to regard an observa- 
tional time series (1) as additively built up by a sum of harmonics 
and a random sample of a certain aleatory variable. Thus, denot- 
ing by 

(38)    $\xi(1), \xi(2), \ldots, \xi(n)$ 

the hypothetical random variables corresponding to the observational 
set (1), we have in this case 

(39)    $\xi(t) = y(t) + \eta(t),$ 

where $y(t)$ is of type (17), and $\eta(t)$ is a set of independent random 
variables as used in the previous section. 

A chief problem when applying a scheme of hidden periodicities 
is to perform a separation of the functional and the random com- 
ponents. Since we are concerned with the case when y{t) is a 
composed harmonic (17), the variate difference method, and other 
methods based on the assumption that y{t) reduces to a polynomial 
or to a trend function, fall outside of the program (cf. p. 1). 

The principal method for the search of harmonic components in 
a given time series (1) is the Schuster periodogram method de- 
scribed in section 5. The test problem concerning the significance 
of the ordinates in the empirical periodogram was already mentioned 
in section 7. 

In respect to the Oppenheim-Bruns method for separating har- 
monic components (cf. section 5), it has been emphasized by J. I. 
Craig (1916) that the method fails when a random component is 
superposed on the harmonics. This disturbing effect of the random 
error will be called the »Craig effect». As the Oppenheim-Bruns 
method is parallel to a method of importance in the sequel, and 
as the Craig effect — possibly because of the sketchy character 
of Mr. Craig’s paper — seems to have been overlooked in later 
literature, the point in question will be taken up in a separate 
discussion. This is done in section 28. 

The problem of separating the functional and probabilistic ele- 
ments being, of course, to a considerable extent indeterminate, even 
rough methods for the search of periodicities may be of interest. 
An approximate method, which is of particular relevance because it is 
both convenient and capable of delivering more general periodic 
components than harmonic functions, is based on the well-known 
Buys-Ballot table (see e.g. K. Stumpff (1927), p. 100): 

(40)    $\xi_1$,            $\xi_2$,            ...,  $\xi_p$ 
        $\xi_{p+1}$,        $\xi_{p+2}$,        ...,  $\xi_{2p}$ 
        ...                 ...                 ...   ... 
        $\xi_{(k-1)p+1}$,   $\xi_{(k-1)p+2}$,   ...,  $\xi_{kp}$ 
        $\xi_{kp+1}$,       $\xi_{kp+2}$,       ...,  $\xi_n$ 

(41)    $m_1, m_2, \ldots, m_{n-kp}, \ldots, m_p.$ 

Here $p$ is an integer, $k$ stands for the greatest integer which is 
less than $n/p$, and $m_i$ is the arithmetical mean of the $i$:th column. 
Denoting by $k_i$ the number of elements in the $i$:th column, 
we have 

(42)    $k_i = k + 1$ for $i \le n - kp$, 
        $k_i = k$ for $i > n - kp$. 

The leading idea of the method simply is that if the series 
contains a component $y(t)$ with period $p$, the values of $y(t)$ for 
$t = 1, 2, \ldots, p$ are approximately given by the means $m_1, m_2, \ldots, m_p$. 

While the arrangement (40) was used even before C. H. D. Buys-Ballot 
(1847), the method was developed in detail by B. Stewart 
and W. Dodgson (1879) and others (cf. H. Burkhardt (1904) p. 679 f.). 
A sharpening of the method by means of a periodogram construction, 
generalizing that of A. Schuster, is due to E. T. Whittaker 
(1911). In the Whittaker periodogram, $p$ is taken for abscissa, 
and the ordinate in $p$ equals the (weighted) variance in the series 
$m_i$ divided by the variance in the series $\xi_t$. 

The connexion between the two periodogram constructions is 
interesting. Considering the Schuster periodogram, $C^2(\lambda)$ is well 
known to be approximated, up to a constant factor, by the maximum 
value for varying $A$ and $B$ of the expression 

(43)    $D^2(\xi) - \frac{1}{n} \sum_{t=1}^{n} (\xi_t - m - A \cdot \cos \lambda t - B \cdot \sin \lambda t)^2.$ 

On the other hand, Professor H. Cramér, in his Course in 1933, 
showed that in the Whittaker periodogram the ordinate equals, 
apart from the constant denominator $D^2(\xi)$, the maximum of an 
expression generalizing (43), viz. 

(44)    $D^2(\xi) - \frac{1}{n} \sum_{t=1}^{n} (\xi_t - m - y_p(t))^2.$ 

This expression should be maximized under the condition that $y_p(t)$ 
be a function of integral period $p$, but arbitrary for the rest.¹ 

With the notations used in (40), the maximum value of (44) 
equals $\frac{1}{n} \sum_{i=1}^{p} k_i \cdot (m_i - m)^2$, which for the ordinates in the Whittaker 
periodogram gives 

(45)    $\sum_{i=1}^{p} k_i \cdot (m_i - m)^2 \Big/ n \cdot D^2(\xi).$ 

In case $n$ is a multiple of $p$, (45) reduces to 

$\sum_{i=1}^{p} (m_i - m)^2 \Big/ p \cdot D^2(\xi) = D^2(m_i)/D^2(\xi).$ 

¹ The interpretation of the generalized periodogram as a correlation ratio is 
incorrect (cf. E. T. Whittaker and G. Robinson (1926), p. 345 f.). 

In his 1933 Course, Prof. H. Cramér also delivered an expectance 
theory of the Whittaker periodogram. He showed, i.a., 
that if the given series is a random sample of a normally distributed 
variable, then $D^2(m_i)$ follows the well-known Student's distribution. 

It should be observed that as a principle the Schuster and the 
Whittaker periodograms are of equal efficiency — if the former 
discovers periods in a given observational series, then the latter 
will give positive results, and vice versa. 
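
The Buys-Ballot arrangement and the resulting periodogram ordinate can be sketched as follows. The code assumes NumPy and takes the ordinate for a trial period p to be the weighted variance of the column means divided by the variance of the series, the weights being the column lengths; the function names and the test series are illustrative assumptions.

```python
import numpy as np

def buys_ballot_means(xi, p):
    """Column means m_1..m_p and column lengths k_1..k_p of the table with p columns."""
    xi = np.asarray(xi, dtype=float)
    cols = [xi[i::p] for i in range(p)]      # column i+1 holds xi_(i+1), xi_(i+1+p), ...
    means = np.array([c.mean() for c in cols])
    counts = np.array([c.size for c in cols])
    return means, counts

def whittaker_ordinate(xi, p):
    """Weighted variance of the column means divided by the variance of the series."""
    xi = np.asarray(xi, dtype=float)
    means, counts = buys_ballot_means(xi, p)
    m = xi.mean()
    return float((counts * (means - m) ** 2).sum() / (xi.size * xi.var()))

# Strictly periodic series of period 4, repeated six times (assumed data).
xi = np.tile([1.0, 4.0, 2.0, 7.0], 6)
ordinates = {p: whittaker_ordinate(xi, p) for p in (3, 4, 5, 8)}
print(ordinates)
```

For the true period and its multiples the column means reproduce the series exactly and the ordinate is 1; for a trial period of 3 every column averages to the overall mean and the ordinate is 0.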


9. On the criticism of the scheme of hidden periodicities. 

While the hypothesis of hidden periodicities has proved very fruitful 
in many fields of scientific research, many applications early met with 
a serious criticism (cf. H. Burkhardt (1904) p. 685). The essential 
point of criticism bears upon the postulated strict periodicity of the 
individual functional components, and it has been maintained that this 
rigidity in the periodicity often has no empirical correspondence. 

The serial and autocorrelation coefficients disclose an interesting 
aspect of the above critical argument. If the functional element 
$y(t)$ in (39) is a composed harmonic (17), the autocorrelation 
coefficients are for $k \neq 0$ given by 

(46)    $r_k = \sum_{i=1}^{s} C_i^2 \cdot \cos \lambda_i k \Big/ \left( 2 D^2(\eta) + \sum_{i=1}^{s} C_i^2 \right).$ 

The hypothesis of hidden periodicities assumes, therefore, that, for 
$k \neq 0$, the autocorrelation coefficient $r_k$, too, is a function of the 
type (17), i.e. a composed harmonic. According to the relation (19), 
the hypothesis thus implies that there exist arbitrarily large values 
$k$ such that $r_k$ is approximately given by the value taken by (46) for 
$k = 0$, viz. 

(47)    $\sum_{i=1}^{s} C_i^2 \Big/ \left( 2 D^2(\eta) + \sum_{i=1}^{s} C_i^2 \right).$ 

This implication may, as a matter of fact, be used as a criterion 
of the applicability of the hypothesis of hidden periodicities. Thus, 
if a given time series clearly shows a cyclical character but its 
serial coefficients are gradually vanishing, then the scheme 
of hidden periodicities is not an adequate approach. 
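
The non-vanishing of the autocorrelation coefficients under hidden periodicities can be illustrated numerically. The sketch assumes NumPy and uses, for a composed harmonic with amplitudes C_i and frequencies lam_i plus an independent random component of variance var_eta, the coefficient r_k = sum(C_i^2 cos(lam_i k)) / (2 var_eta + sum(C_i^2)) for k different from 0. The function name and the numerical values are assumptions for the example.

```python
import numpy as np

def autocorrelation_hidden_periodicities(k, C, lam, var_eta):
    """r_k for k != 0: composed harmonic plus independent random component."""
    C2 = np.asarray(C, dtype=float) ** 2
    lam = np.asarray(lam, dtype=float)
    k = np.atleast_1d(k)
    return (C2 * np.cos(np.outer(k, lam))).sum(axis=1) / (2.0 * var_eta + C2.sum())

C = [2.0, 1.0]                               # amplitudes (assumed)
lam = [2 * np.pi / 12, 2 * np.pi / 4]        # periods of 12 and 4 time units (assumed)
var_eta = 1.5                                # variance of the random component (assumed)

lags = np.arange(1, 121)
r = autocorrelation_hidden_periodicities(lags, C, lam, var_eta)
upper = sum(c ** 2 for c in C) / (2.0 * var_eta + sum(c ** 2 for c in C))
print(r[[11, 23, 119]], upper)   # lags 12, 24 and 120 all return to the same level
```

In this example the coefficient returns to the level 0.625 at every multiple of 12, however large the lag; gradually vanishing serial coefficients in empirical data therefore speak against the scheme.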

It seems plausible that in a good many oscillatory phenomena 
the serial coefficients actually are gradually vanishing. The above 
criterion shows that in such cases a periodogram analysis would 
give negative results. The table of serial coefficients in air pressure 
material from Port Darwin analyzed by Sir G. Walker ((1931), 
p. 528) may be referred to for illustration. The graph of serial 
coefficients presents damped oscillations of a period of about three 
years. 

It follows from the above that in a descriptive analysis of a 
time series the serial coefficients of G. U. Yule are of fundamental 
importance. J. Bartels (1935) has recently given another method 
of descriptive analysis, consisting in a generalization of the Buys- 
Ballot table, and directly constructed as a criterion of the applica- 
bility of the hypothesis of hidden periodicities. J. Bartels forms 
the Buys-Ballot table for successively extended sections of the 
given series. Let $\sigma(k, \nu)$ stand for the $\nu$:th section consisting of $k$ 
consecutive rows in (40), and let $q$ be the number of such sections. 
Let $D_p^2(k, \nu)$ be the variance of the column averages $m_i(k, \nu, p)$ in 
the section $\sigma(k, \nu)$, and let their arithmetical mean in respect of $\nu$ 
be $\Delta_p(k)$, 

$\Delta_p(k) = \frac{1}{q} \sum_{\nu=1}^{q} D_p^2(k, \nu).$ 

For the expression 

(48)    $B_p(k) = k \cdot \Delta_p(k) / \Delta_p(1),$ 

regarded as a function of $k$, J. Bartels proposes the term persistency 
characteristic because of the following observations. 

Let the series $\xi_t$ under investigation contain a component of 
persistent period, i.e. a strictly periodic component. If the length of 
the period equals $p$, and if $k$ is large, each of the $q$ sets $m_i(k, \nu, p)$, 
considering the variation with $i$, will nearly reproduce the periodic 
component. Consequently, $\Delta_p(k)$ will tend to a positive constant 
as $k \to \infty$, and $B_p(k)$ will therefore increase nearly proportionately 
with $k$. On the other hand, if the time series is purely random, 
$B_p(k)$ will show no tendency to vary with $k$, but will remain on the 
unity level. Concerning intermediate cases, J. Bartels remarks 
that $B_p(k)$ may tend to an asymptote above unity. Then a quasi-persistency 
is present, a tendency of adjacent rows in the Buys-Ballot 
table to show a certain resemblance, a resemblance which 
will fade away as more distant rows are compared. 
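
A rough computational reading of the persistency characteristic is sketched below. It assumes NumPy, and it assumes that the sections are formed from non-overlapping blocks of k consecutive complete rows of the table, which the text does not state explicitly; the function name and the test series are illustrative. The quantity computed is k times the average variance of the column means over sections of k rows, divided by the same average for sections of one row.

```python
import numpy as np

def persistency_characteristic(xi, p, k_values):
    """k * (mean section variance for k rows) / (mean section variance for 1 row)."""
    xi = np.asarray(xi, dtype=float)
    rows = xi[: (xi.size // p) * p].reshape(-1, p)     # complete rows of the table

    def mean_section_variance(k):
        q = rows.shape[0] // k                         # number of non-overlapping sections
        sections = rows[: q * k].reshape(q, k, p)
        col_means = sections.mean(axis=1)              # column averages per section
        return col_means.var(axis=1).mean()            # averaged over the q sections

    base = mean_section_variance(1)
    return np.array([k * mean_section_variance(k) / base for k in k_values])

k_values = [1, 2, 3, 4]
periodic = np.tile([1.0, 4.0, 2.0, 7.0], 24)           # strictly periodic, p = 4 (assumed)
rng = np.random.default_rng(0)
random_series = rng.normal(size=12 * 3000)             # purely random series (assumed)

B_periodic = persistency_characteristic(periodic, 4, k_values)
B_random = persistency_characteristic(random_series, 12, k_values)
print(B_periodic, B_random)
```

For the strictly periodic series the characteristic grows in proportion to k, while for the random series it stays near unity, matching the two limiting cases described above.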

The persistency characteristics computed by J. Bartels ((1935) 
p. 519 f.) suggest persistency in statistical data concerning a) the 
half year period in the international index of terrestrial magnetism, 
b) the 24 hour wave in air pressure in Potsdam, c) the 12 hour 
component of the Batavia temperature. On the other hand, quasi- 
persistency is suggested in a) the 27 day component in terrestrial 
magnetism, b) the 24 hour wave in the magnetic east component 
in Batavia. 

In economics, the classical periodogram analysis has repeatedly 
been tried on business cycle material. The negative results support 
an opinion which has been maintained also on logical-theoretical 
grounds, and which now seems predominant, viz. that the hypothesis 
of hidden periodicities is inadequate in business cycle theory. 

In cases like those mentioned above, where the approach of rigid 
periodicity fails, the schemes of linear regression announced in 
section 7 seem to form a natural and interesting substitute for the 
scheme of hidden periodicities. Reference to previous results con- 
cerning the schemes of linear regression being given later, when 
dealing systematically with these schemes, the concluding section in 
this survey will enter into detail only as to the earliest papers on 
the schemes in question. 


10. Remarks on the schemes of linear regression. 

In the hypothesis of hidden periodicities, there is assumed a 
far-reaching interdependence between the elements of the given time 
series; leaving the random component out of the question, the 
interdependence is assumed to be purely functional. The schemes of 
linear regression assume as to adjacent elements an interdependence 
only in the sense of the theory of probability. 

In an interesting study on the variate difference method, G. U. 
Yule (1921) considers the autocorrelation in a series consisting of 
iterated differences, say of order $m$, obtained from a purely random 
series. Since the autocorrelation coefficients are 

(49)    $r_k = (-1)^k \cdot \frac{m (m-1) \cdots (m-k+1)}{(m+1)(m+2) \cdots (m+k)}, \quad k > 0,$ 

the series of differences must present an oscillatory character, a 
feature increasing in evidence with $m$. In other words, we are 
concerned with a primary sequence of random variables, say 
$\ldots, \eta(t-1), \eta(t), \eta(t+1), \ldots$, which by hypothesis are independent, 
and have identical distribution functions; on this basis a secondary 
series, say $\ldots, \xi(t-1), \xi(t), \xi(t+1), \ldots$, is defined by means of 
a moving linear operation of the type 

(50)    $\xi(t) = b_0 \cdot \eta(t) + b_1 \cdot \eta(t-1) + \cdots + b_h \cdot \eta(t-h).$ 

The approach thus defined will in the sequel be called the scheme 
of moving averages. 
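
The autocorrelation of iterated differences can be verified directly from the weights of the moving operation. The sketch assumes NumPy and uses the general fact that a moving average of independent variables with weights b_0, ..., b_h has autocorrelation sum(b_j b_(j+k)) / sum(b_j^2); the m-th difference corresponds to the weights (-1)^j times the binomial coefficient of m over j. Function names are illustrative.

```python
from math import comb

import numpy as np

def difference_weights(m):
    """Weights b_0..b_m of the m-th difference, read as a moving average."""
    return np.array([(-1) ** j * comb(m, j) for j in range(m + 1)], dtype=float)

def moving_average_autocorrelation(b, k):
    """r_k of a moving average of independent variables with weights b."""
    b = np.asarray(b, dtype=float)
    if k >= b.size:
        return 0.0
    return float(np.dot(b[: b.size - k], b[k:]) / np.dot(b, b))

def closed_form(m, k):
    """(-1)^k * m(m-1)...(m-k+1) / ((m+1)(m+2)...(m+k)) for k > 0."""
    num = np.prod([m - j for j in range(k)], dtype=float)
    den = np.prod([m + j for j in range(1, k + 1)], dtype=float)
    return (-1) ** k * num / den

for m in (1, 2, 6):
    b = difference_weights(m)
    print(m, [round(moving_average_autocorrelation(b, k), 4) for k in (1, 2, 3)])
```

The first coefficient is -1/2 for m = 1 and approaches -1 as m grows, which is the increasingly oscillatory character mentioned in the text.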

Another particular case of moving average (50) is studied by 
E. Slutsky (1927), who forms the secondary, intercorrelated series 
from the primary, purely random series by $n$ iterated summations 
by two, followed by the forming of $m$:th differences. Holding $m/n$ 
constant, E. Slutsky shows that, with probability 1, an arbitrarily 
fixed section of the difference series will tend to a sine curve as 
$n \to \infty$. This result is given as an application of a general theorem 
proved in the same paper and discussed in section 16. 

Stochastical interdependence of the type (50) is a particular case 
of linear regression. Letting the auxiliary variables $\eta(t)$ be the 
same, another type of linear regression is indicated by the following 
implicit definition of the intercorrelated variables $\xi(t)$, 

(51)    $\xi(t) + a_1 \cdot \xi(t-1) + \cdots + a_h \cdot \xi(t-h) = \eta(t).$ 

The approach (51) was introduced in a heuristic manner in an 
important memoir by G. U. Yule (1927). The fundamental difference 
between the scheme of hidden periodicities and the scheme 
(51), which is said by Yule to define a »disturbed harmonic», is 
clearly brought out. In (39) the random elements $\eta(t)$ »do not in 
any way disturb the steady course of the underlying periodic function 
or functions» (p. 268). On the other hand, regarding (51) Yule 
(p. 294) states that a principal feature of a disturbed periodic movement 
is »a continual change of amplitude and shift of phase». 

In the same paper G. U. Yule applies, with success, his hypothesis 
to empirical data, viz. A. Wolfer's sunspot numbers. Further, 
he makes the following general statement concerning the scope of 
the scheme (51): »Disturbance will always arise if the value of the 
variable is affected by external circumstance and the oscillatory variation 
with time is wholly or partly self-determined, owing to the value 
of the variable at any one time being a function of the immediately 
preceding values. Disturbance, as it seems to me, can only be excluded 
if either (1) the variable is quite unaffected by external circumstance, 
or (2) we are dealing with a forced vibration and the external 
circumstances producing this forced vibration are themselves undisturbed» (p. 295). 

In order to attain conformity in terminology, the approach (51) 
will in the sequel be descriptively called the scheme of (linear) 
autoregression. G. U. Yule (1927) restricts himself to the cases 
$h < 4$. General autoregression as implicitly defined by (51) was 
dealt with by Sir G. Walker (1931), whose applications to the 
air pressure data mentioned in section 9 gave positive results. 
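
A scheme of autoregression is straightforward to simulate and to fit. The sketch below assumes NumPy and a relation of the form xi(t) + a_1 xi(t-1) + ... + a_h xi(t-h) = eta(t) with independent normal disturbances eta(t); the coefficient values, the burn-in length and the least-squares fit are choices made for the illustration and are not taken from the memoirs cited.

```python
import numpy as np

def simulate_autoregression(a, n, rng, burn_in=200):
    """Generate n values of xi(t) = eta(t) - a_1 xi(t-1) - ... - a_h xi(t-h)."""
    a = np.asarray(a, dtype=float)
    h = a.size
    eta = rng.normal(size=n + burn_in)
    xi = np.zeros(n + burn_in)
    for t in range(h, n + burn_in):
        past = xi[t - h: t][::-1]              # xi(t-1), ..., xi(t-h)
        xi[t] = eta[t] - np.dot(a, past)
    return xi[burn_in:]                        # drop the start-up transient

def fit_autoregression(xi, h):
    """Least-squares estimate of (a_1, ..., a_h) from an observed series."""
    xi = np.asarray(xi, dtype=float)
    lagged = np.column_stack([xi[h - j: xi.size - j] for j in range(1, h + 1)])
    coef, *_ = np.linalg.lstsq(lagged, -xi[h:], rcond=None)
    return coef

rng = np.random.default_rng(1)
a_true = [-1.2, 0.72]        # damped case: characteristic roots of modulus about 0.85 (assumed)
xi = simulate_autoregression(a_true, n=5000, rng=rng)
a_hat = fit_autoregression(xi, h=2)
print(a_hat)
```

Plotting a stretch of `xi` shows waves whose amplitude and phase drift continually, in contrast to a harmonic with a superposed random error, whose underlying oscillation keeps its phase.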

As shown in detail by E. Slutsky (see e.g. (1937)), even a scheme 
of moving averages (50) may present waves of shifting amplitude 
and phase. Thus, both schemes of linear regression are of interest 
to the theory of those oscillatory phenomena for which the hypoth- 
esis of hidden periodicities proves inadequate. The investigations 
referred to in section 9 show that there are many central pheno- 
mena of this kind. 

While the schemes of linear regression thus form a type of 
hypothesis of the greatest importance, the development of the sub- 
ject is still little advanced, both as to the theory and the applica- 
tion of the schemes. For instance, earlier definitions concerning 
the scheme of autoregression are incomplete. One of the chief 
purposes of the present volume is to give some contributions for 
completion in these respects. It also aims at bringing the schemes 
into place in the theory of probability, thereby uniting the rather 
isolated results hitherto reached. 

In the theory of probability, the schemes of linear regression 
fall under the heading of the discrete stationary random process 
as defined by A. Khintchine (see (1932) and (1933)). As a matter 
of fact, the concept of stationary random process is extremely 
general, and the restrictions involved are only those indispensable 
in hypotheses concerning stationary phenomena (cf. p. 3). It is 
therefore but natural that the scheme of hidden periodicities, after 
a slight change in the interpretation, will also be found to form 
a special discrete stationary process (see section 15 12). Accordingly, 
the theoretical developments start with a chapter on the discrete 
random process. This analysis will, i.a., deepen the insight into 
the nature of the schemes of linear regression, which are dealt 
with in Chapter III. Chapter IV contains some applications of 
different hypotheses to observational time series. 

In Chapter IV, the serial coefficients of the observational data 
play a fundamental role. The question concerning the quantitative 
significance of such correlation coefficients is taken up in Appen- 
dix B. Following an argument presented in a preliminary note 
(see H. Wold (1936)), it is shown that these coefficients, and like- 
wise all correlation coefficients obtained from a broad class of time 
and spatial series, have their quantitative significance conditioned by 
the size of the statistical masses to which the data refer. 


CHAPTER II.


On the theory of the discrete stationary random process.

11. Definition of the stationary processes. 

In the theoretical analysis of a statistical time series, we may
distinguish between functional and probabilistic approaches. In the
former, the time series is represented by a function of time which
in the general case is univalent, but otherwise unconditioned. In
the latter, the most general approach is the unconditioned random
process. From a purely mathematical viewpoint, the random process
is a random variable in an infinite number of dimensions. Denoting
by $\{t\}$ the set of time points $t$ in which the phenomenon changing
with time is studied, each element in $\{t\}$ corresponds to one
dimension in the random variable.

Let $\{t\}$ stand for a set of values taken on by a real parameter
$t$, which will be spoken of as representing time, and let one random
variable correspond to each time point $t$ in $\{t\}$. Denoting the
given set of random variables by $\{\xi(t)\}$, let it be assumed that
the following conditions are satisfied.

(A). Choosing arbitrarily a sub-set in $\{t\}$, say
$(t) = (t_1, \dots, t_n)$, the combined variable $\xi(t_1, \dots, t_n)$
will be well-defined.

Let the distribution function of $\xi(t_1, \dots, t_n)$ be denoted by
$F(t_1, \dots, t_n; u_1, \dots, u_n)$, so that

(52)  $F(t_1, \dots, t_n; u_1, \dots, u_n) = P[\xi(t_1) \le u_1, \dots, \xi(t_n) \le u_n]$,

and let the sets of distribution functions and probability functions
of the variables $\xi(t_1, \dots, t_n)$ be denoted by $\{F\}$ and $\{P\}$
respectively.

(B). Letting $(t) = (t_1, \dots, t_n)$ be an arbitrary sub-set in $\{t\}$,
and $(i_1, \dots, i_n)$ be an arbitrary permutation of the sequence
$(1, 2, \dots, n)$, the functions $\{F\}$ will satisfy the following
relations identically in $u_1, \dots, u_n$,

(53)  $F(t_{i_1}, \dots, t_{i_n}; u_{i_1}, \dots, u_{i_n}) = F(t_1, \dots, t_n; u_1, \dots, u_n)$,

(54)  $F(t_1, \dots, t_m; u_1, \dots, u_m) = F(t_1, \dots, t_m, t_{m+1}, \dots, t_n; u_1, \dots, u_m, +\infty, \dots, +\infty)$,

where $m < n$.

These relations express merely that the probability laws ruling
the variables $\{\xi(t)\}$ must not contradict themselves. Accordingly,
these relations will be referred to as the consistency relations.

Following A. Kolmogoroff ((1931), (1933)), a set $\{\xi(t)\}$ satisfying
the conditions (A) and (B) will be called a random process.

According to a fundamental theorem of A. Kolmogoroff ((1933),
p. 27), a set $\{F\}$ belonging to a random process $\{\xi(t)\}$ defines a
probability distribution on those sets in a space $R_t$ of an infinite
number of dimensions $\{t\}$, which are formed by an enumerable sum of
Borel's cylinder sets in $R_t$. For instance, if the sequence $t$, $t-1$,
$t-2$, ... is contained in the set $\{t\}$, the probability
$P[\xi(t) \le u_0, \xi(t-1) \le u_1, \xi(t-2) \le u_2, \dots]$ will exist
for any real sequence $u_0, u_1, u_2, \dots$

In order to define stationarity, we must consider arbitrary
translations within the set $\{t\}$. In doing this, we can assume that
$\{t\}$ either consists of all real values, or is formed by an unbroken
sequence of equidistant values, say ..., $-1$, $0$, $1$, $2$, ... In any
case, a random process $\{\xi(t)\}$ as defined by a set $\{F\}$ is termed
stationary in the sense of A. Khintchine ((1932)-(1934)), if for an
arbitrary sub-set $(t) = (t_1, \dots, t_n)$ in $\{t\}$ the relation

(55)  $F(t_1 + t, t_2 + t, \dots, t_n + t; u_1, \dots, u_n) = F(t_1, t_2, \dots, t_n; u_1, \dots, u_n)$

is identically satisfied in $u_1, \dots, u_n$ and in $t$. Again following
Khintchine, the process will be called discrete, if $t$ is restricted to
a sequence of equidistant values, continuous if $t$ is arbitrary.

According to the interpretation indicated in section 2, the variables
$\xi(t)$ considered in the above definitions may be taken to be
multi-dimensional. Thus, just as in the case of ordinary random
variables, a $k$-dimensional random process $\{\xi(t)\}$ may be looked
upon as obtained by combining $k$ one-dimensional processes, say
$\{\xi^{(1)}(t)\}, \dots, \{\xi^{(k)}(t)\}$. Now, in studying simultaneously
a group of one-dimensional processes, we shall always assume that an
arbitrary finite sub-group, say $\{\xi^{(i_1)}(t)\}, \dots, \{\xi^{(i_k)}(t)\}$,
can be combined into a $k$-dimensional random process. For instance,
considering an infinite sequence $\{\xi^{(1)}(t)\}, \{\xi^{(2)}(t)\}, \dots$,
this assumption may be expressed as follows. Taking out arbitrarily a
group $\{\xi^{(i_1)}(t)\}, \dots, \{\xi^{(i_k)}(t)\}$, fixing arbitrarily
a set of time points $t_1, \dots, t_n$, and a double real sequence
$u_s^{(r)}$, where $r = 1, 2, \dots, k$; $s = 1, 2, \dots, n$, we shall
assume that the probability

$P[\xi^{(i_r)}(t_s) \le u_s^{(r)};\ r = 1, \dots, k;\ s = 1, \dots, n]$

will exist; further, we shall assume that these probabilities will
satisfy all consistency relations of type (53)-(54); in case of
stationarity we shall also assume that all relations of type (55) will
hold.

Considering the variables $\xi(t)$ which constitute a discrete random
process $\{\xi(t)\}$, let us choose $t$ arbitrarily, and form the sequence
$\xi(t)$, $\xi(t-1)$, $\xi(t-2)$, etc. In conformity with the above, this
sequence will be denoted $\xi(t, t-1, t-2, \dots)$ when dealt with as
a random variable in an infinite number of dimensions. Such
variables being used frequently, we shall adopt the short notation
$\xi[t] = \xi(t, t-1, t-2, \dots)$. In case the process $\{\xi(t)\}$ is
stationary, it is evident that the probability distribution of $\xi[t]$
is independent of $t$ and, what is likewise of importance, that the
distribution of $\{\xi(t)\}$ will be uniquely determined by that of
$\xi[t]$.

Expectations derived from the distribution functions $\{F\}$
determining a stationary process $\{\xi(t)\}$ will be called
characteristics of the process. The characteristics are, of course,
independent of $t$. Further, considering the distribution functions
$F(t; u)$ in the set $\{F\}$, these will be independent of $t$. The
function of $u$ thus obtained will be termed the principal distribution
function of the process considered. By definition, the mean ($m$), the
dispersion ($D$), etc., of a one-dimensional stationary process are
given by the corresponding characteristics as obtained from the
principal distribution function (cf. (5) and (6)).

If the dispersion of a one-dimensional stationary process is finite,
the automoments of second order as defined by (cf. (10) and (11))

$E[\xi(t) \cdot \xi(t+k)] = \iint u v \, d_{u,v} F(t, t+k; u, v)$

will be finite. The characteristics mentioned determine the
autocorrelation coefficients belonging to the stationary process
$\{\xi(t)\}$ considered (cf. (12)),

(56)  $r_k = \dfrac{E[(\xi(t) - m)(\xi(t+k) - m)]}{D^2}$.

If $r_k = 0$ for all $k \ne 0$, the process $\{\xi(t)\}$ will be termed
non-autocorrelated.
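
The coefficients (56) have an obvious empirical counterpart, which may be sketched in Python as follows. The function name `serial_correlation` and the choice of estimator (the sample mean and the sample second moment about the mean standing in for $m$ and $D^2$) are assumptions made for this illustration only; they are not taken from the text.

```python
def serial_correlation(series, k):
    """Estimate the autocorrelation coefficient r_k from a finite series.

    Assumption: the sample mean and the sample second moment about the
    mean are used in place of the characteristics m and D^2.
    """
    n = len(series)
    k = abs(k)  # r_k = r_{-k} for a stationary process
    if n < 2 or k >= n:
        raise ValueError("need at least two observations and |k| < len(series)")
    mean = sum(series) / n
    dev = [x - mean for x in series]
    variance = sum(d * d for d in dev) / n
    if variance == 0:
        raise ValueError("series is constant; r_k is undefined")
    # Products of deviations k steps apart, divided by n (not n - k).
    cov = sum(dev[t] * dev[t + k] for t in range(n - k)) / n
    return cov / variance
```

For a strictly alternating series such as `[1, -1] * 4` the estimate at lag 1 is strongly negative and that at lag 2 strongly positive, as one would expect; dividing by `n` rather than `n - k` is one of several conventions in use.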


A. Khintchine (1932) gives also a more embracing definition of
the stationary process, requiring only that the characteristics $m$, $D$
and $r_k$ shall be independent of $t$. This case will be referred to as
the generalised stationary process.

Let $\{\xi(t)\}$ be a stationary process, and consider the variables
$\xi(t_1, \dots, t_m)$ and $\xi(t_{m+1}, \dots, t_n)$ which refer to the
time points $(t_1, \dots, t_m)$ and $(t_{m+1}, \dots, t_n)$. We shall
sometimes have to regard the process $\{\xi(t)\}$ in the set
$(t_{m+1}, \dots, t_n)$ as conditioned by the behaviour of $\{\xi(t)\}$ in
the time points $(t_1, \dots, t_m)$. Following a familiar terminology, we
shall then speak of the variable $\xi(t_{m+1}, \dots, t_n)$ as being
conditioned by $\xi(t_1, \dots, t_m)$. Indicating conditionality by an
index $C$, and denoting by $C'$ the condition obtained from $C$ by
replacing throughout $t_i$ by $t_i + t$, it is evident that for an
arbitrarily fixed $t$ the two conditioned variables
$\xi_C(t_{m+1}, \dots, t_n)$ and $\xi_{C'}(t_{m+1} + t, \dots, t_n + t)$
will have identical distribution functions if the process $\{\xi(t)\}$ is
stationary. The reader is referred to A. Kolmogoroff ((1933), Chapter V)
for the fundamentals concerning conditioned variables and distributions.

Generally speaking, a characteristic of a conditioned variable
depends on the conditioning variable, and will be called a conditioned
characteristic. Since such a characteristic forms a function of a
random variable, it constitutes in itself a random variable. For
instance, considering a one-dimensional process $\{\xi(t)\}$, and taking
as before $\xi(t_1, \dots, t_m)$ to be the conditioning variable, the
conditioned expectation of $\xi(t_{m+1})$ is, by definition, the
expectation of $\xi_C(t_{m+1})$. Denoting this expectation by
$E_C[\xi(t_{m+1})]$, a general formula gives (see A. Kolmogoroff (1933),
p. 47)

(57)  $E[E_C[\xi(t_{m+1})]] = E[\xi(t_{m+1})] = m(\xi)$.

From now on, when not explicitly stated otherwise, the random
processes dealt with are tacitly understood (A) to be one-dimensional,
(B) to be discrete, and defined for integral time points, (C) to have a
finite dispersion. Of course, the confinement to integral time points
instead of a general equidistant sequence, say

$\dots,\ t_0 - a,\ t_0,\ t_0 + a,\ t_0 + 2a,\ \dots$

involves no restriction as to the generality of the theory of the
discrete process.

12. A theorem of A. Khintchine.

As far as I am aware, the only earlier investigation of the
general discrete stationary process is that of A. Khintchine ((1932),
(1933)) already referred to. Though the problems dealt with in the
sequel lie along entirely different lines, the principal theorem of
Khintchine will be quoted in full because it discloses a fundamental
property of the stationary process. The theorem in question states
that the stationary process is subjected to the law of great numbers,
viz. in the following sense:

Let $\xi(t), \xi(t-1), \dots, \xi(t-n+1)$ be a finite sequence of
variables connected with a discrete stationary process $\{\xi(t)\}$ with
finite dispersion, and put $S_n = \frac{1}{n} \sum_{i=0}^{n-1} \xi(t-i)$.
The dispersion of the difference $S_n - S_m$ then tends to zero when
$n \to \infty$ and $m \to \infty$. Further, processes may be
constructed so that the asymptotical decrease is arbitrarily slow.
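
The statement can be checked numerically once the autocorrelation coefficients of a process are given, since the squared dispersion of $S_n - S_m$ is a quadratic expression in them. The following Python sketch is offered under that reading; the function name and the two example correlation sequences are assumptions of the illustration, not part of Khintchine's theorem.

```python
def dispersion_sq_of_mean_difference(r, n, m, sigma_sq=1.0):
    """Squared dispersion of S_n - S_m for a stationary process.

    r is a function k -> r_k (autocorrelation coefficient at lag k) and
    sigma_sq is the squared dispersion of the process. S_n denotes the
    mean of the n variables xi(t), xi(t-1), ..., xi(t-n+1).
    """
    size = max(n, m)
    # Coefficient of xi(t - i) in the difference S_n - S_m.
    c = [(1.0 / n if i < n else 0.0) - (1.0 / m if i < m else 0.0)
         for i in range(size)]
    return sigma_sq * sum(c[i] * c[j] * r(abs(i - j))
                          for i in range(size) for j in range(size))


def geometric(k):
    """Hypothetical example: r_k = 0.5 ** |k|."""
    return 0.5 ** abs(k)


def uncorrelated(k):
    """Hypothetical example: r_0 = 1 and r_k = 0 otherwise."""
    return 1.0 if k == 0 else 0.0
```

With `uncorrelated` the result for $n < m$ is $1/n - 1/m$, and with `geometric` the value for `(100, 200)` is smaller than that for `(10, 20)`, in line with the theorem.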

In view of the applications, in particular certain questions
concerning ergodic hypotheses, it is an interesting problem whether
in the sums $S_n$ the sequence $i = 0, 1, 2, \dots$ can be replaced by the
sequence $i_1 < i_2 < i_3 < \dots$ A short reflection on the singular
processes as defined and exemplified in sections 14-16 shows that such
a general sequence is not allowed.

The problems dealt with in the sequel have their points of 
connection with certain investigations on the continuous process 
and on special types of the discrete process. These earlier
investigations will be referred to in the course of the analysis.

The concept of stationary process as introduced by A. Khintchine
is extremely general. As a scheme for the analysis of time series
it will be found to embrace all the schemes mentioned in the survey
given in Chapter I. For the sake of concreteness it will be of
interest, before passing to the general theoretical developments, to
show in detail how these may be obtained by suitable specializations.
To this end a preparatory analysis of the general process will be
useful. Accordingly, the next two sections will be reserved for some
groundwork concerning operation with the discrete stationary
processes.

13. Some fundamental operations with random processes. 

In this section it will be shown that certain familiar operations
with ordinary random variables can also be performed with random
processes. In discussing the situation, no restrictions will be laid
on the processes considered. Generally speaking, the operations
will give rise to new random processes. Moreover, if the processes
dealt with are stationary, the resulting processes will be found to
be stationary. It will be sufficient for our purpose to consider
functions of random processes, and the forming of limit processes
in convergent sequences.

Denoting by $\xi$ a random variable in a $k$-dimensional space $R_k$,
and by $f[x]$ a function which is finite and Borel-measurable in
$R_k$, and whose values are lying in a space $R_p$, it is known that
$f[\xi]$ will be a well-defined random variable in $R_p$ (see e.g. H.
Cramér (1937), p. 12 f). Let us next consider a combined variable
$[\xi^{(1)}, \dots, \xi^{(n)}]$ consisting of random variables in $R_k$.
Forming the variables $f[\xi^{(1)}], \dots, f[\xi^{(n)}]$, it is evident
that these in the same way may be combined to a random variable
$[f[\xi^{(1)}], \dots, f[\xi^{(n)}]]$ (cf. also p. 10).

Thus prepared, let $f[x]$ remain the same, and consider the
variables $\xi(t_1, \dots, t_n)$ which constitute a random process
$\{\xi(t)\}$. According to the above, the variables
$[f[\xi(t_1)], \dots, f[\xi(t_n)]]$ will be well-defined. Denoting by
$\{F^*\}$ the set of distribution functions of these variables, it is
also evident that the functions $F^*$ satisfy all consistency relations
of type (53)-(54). Further, if $\{\xi(t)\}$ is stationary, the functions
$F^*$ also will satisfy (55). The variables
$[f[\xi(t_1)], \dots, f[\xi(t_n)]]$ will thus constitute a random process,
and if $\{\xi(t)\}$ is stationary, the process obtained will also be
stationary. The resulting process will be said to be a function of the
process $\{\xi(t)\}$, and be denoted by $\{f[\xi(t)]\}$. The variables of
type $[f[\xi(t)], f[\xi(t-1)], \dots, f[\xi(t-n)]]$ will be denoted by
$f[\xi(t, t-1, \dots, t-n)]$.

In particular, considering a random process $\{\xi(t)\}$ obtained by
combining $k$ one-dimensional processes, say
$\{\xi^{(1)}(t)\}, \dots, \{\xi^{(k)}(t)\}$, let us take $f$ to be linear.
The operation will then give rise to a sum process of type
$\{a_1 \xi^{(1)}(t) + \dots + a_k \xi^{(k)}(t)\}$. Denoting this sum by
$\{\zeta_k(t)\}$, we shall write

$\{\zeta_k(t)\} = a_1 \{\xi^{(1)}(t)\} + \dots + a_k \{\xi^{(k)}(t)\}$.

According to the above, we have

(58)  $\zeta_k(t_1, \dots, t_n) = a_1 \xi^{(1)}(t_1, \dots, t_n) + \dots + a_k \xi^{(k)}(t_1, \dots, t_n)$,

      $\zeta_k[t] = a_1 \xi^{(1)}[t] + a_2 \xi^{(2)}[t] + \dots + a_k \xi^{(k)}[t]$.

Most of the functional operations dealt with in the sequel are of
this simple kind. As to the processes $\{\xi^{(i)}(t)\}$ combined,
these, too, are for the most part of a simple structure. We shall
next present two type cases.

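Let $\{\xi^{(1)}(t)\}, \dots, \{\xi^{(k)}(t)\}$ represent a set of random
processes and let a set of time points $(t) = (t_1, \dots, t_n)$ and $k$
sets of real numbers $(u^{(s)}) = (u_1^{(s)}, \dots, u_n^{(s)})$,
$s = 1, \dots, k$, be chosen arbitrarily. Then the processes will be
called independent if the following relation is satisfied

$P[\xi^{(1)}(t_1) \le u_1^{(1)}, \dots, \xi^{(1)}(t_n) \le u_n^{(1)}; \dots; \xi^{(k)}(t_1) \le u_1^{(k)}, \dots, \xi^{(k)}(t_n) \le u_n^{(k)}]$
$= P[\xi^{(1)}(t_1) \le u_1^{(1)}, \dots, \xi^{(1)}(t_n) \le u_n^{(1)}] \cdots P[\xi^{(k)}(t_1) \le u_1^{(k)}, \dots, \xi^{(k)}(t_n) \le u_n^{(k)}]$.

Similarly, a sequence $\{\xi^{(1)}(t)\}, \{\xi^{(2)}(t)\}, \dots$ will be
said to consist of independent processes if the processes in every
finite subset $\{\xi^{(i_1)}(t)\}, \dots, \{\xi^{(i_k)}(t)\}$ are
independent.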

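Now, let it be assumed that the independent processes
$\{\xi^{(1)}(t)\}, \dots, \{\xi^{(k)}(t)\}$ are stationary, and have finite
dispersions $D(\xi^{(i)})$. Then the sum process $\{\zeta_k(t)\}$ as
defined by (58) will be stationary, and the expectation, the
dispersion, and the autocorrelation coefficients of $\{\zeta_k(t)\}$
will exist. We have
$E[\zeta_k] = a_1 E[\xi^{(1)}] + \dots + a_k E[\xi^{(k)}]$,
and, as is readily verified,

(59)  $D^2(\zeta_k) = a_1^2 D^2(\xi^{(1)}) + a_2^2 D^2(\xi^{(2)}) + \dots + a_k^2 D^2(\xi^{(k)})$,

      $r_p(\zeta_k) \cdot D^2(\zeta_k) = a_1^2\, r_p(\xi^{(1)})\, D^2(\xi^{(1)}) + \dots + a_k^2\, r_p(\xi^{(k)})\, D^2(\xi^{(k)})$.

Of course, the two latter relations depend on the identities

(60)  $r(\xi^{(r)}(t \pm p);\ \xi^{(s)}(t \pm q)) = 0$,  $p \ge 0$, $q \ge 0$,

where $r$ and $s$ ($r \ne s$) are arbitrary. Having stated this,
$\{\xi^{(r)}\}$ and $\{\xi^{(s)}\}$ will be termed non-correlated or
uncorrelated if the relations (60) are satisfied. Evidently, the
relations (59) hold under the broader assumption that any two processes
$\{\xi^{(r)}\}$ and $\{\xi^{(s)}\}$ are non-correlated.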
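
The dispersion relation in (59) may be verified exactly on small examples. The Python sketch below treats a single time point only, takes two independent variables with finite distributions, and enumerates the distribution of their linear combination; the helper names and the example distributions are assumptions introduced for the illustration.

```python
from fractions import Fraction
from itertools import product


def mean_and_dispersion_sq(pmf):
    """Mean and squared dispersion of a finite distribution {value: prob}."""
    mean = sum(p * x for x, p in pmf.items())
    return mean, sum(p * (x - mean) ** 2 for x, p in pmf.items())


def linear_combination_pmf(pmfs, coeffs):
    """Distribution of sum(a_i * X_i) for independent X_i with given pmfs."""
    out = {}
    for combo in product(*(pmf.items() for pmf in pmfs)):
        value = sum(a * x for a, (x, _) in zip(coeffs, combo))
        prob = Fraction(1)
        for _, p in combo:
            prob *= p  # independence: joint probability is the product
        out[value] = out.get(value, Fraction(0)) + prob
    return out


# Hypothetical example distributions.
x_pmf = {0: Fraction(1, 2), 1: Fraction(1, 2)}
y_pmf = {-1: Fraction(1, 3), 0: Fraction(1, 3), 1: Fraction(1, 3)}
zeta_pmf = linear_combination_pmf([x_pmf, y_pmf], [2, 3])
```

Here the squared dispersions of the two components are $1/4$ and $2/3$, so that (59) gives $4 \cdot 1/4 + 9 \cdot 2/3 = 7$ for the combination, which is what the enumeration returns.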

In order to define the second type case, let us consider the
variables $\xi(t_1, \dots, t_n)$ which constitute a random process
$\{\xi(t)\}$. Choosing arbitrarily an integer $k$, let us form a second
type of variable, say $\eta(t_1, \dots, t_n)$, by taking
$\eta(t_1, \dots, t_n) = \xi(t_1 + k, \dots, t_n + k)$. Evidently,
these variables constitute a random process. Now, denoting this
process by $\{\xi(t + k)\}$, a short reflection shows that we may combine
the processes $\{\xi(t)\}$ and $\{\xi(t + k)\}$. In fact, the distribution
functions ruling the simultaneous behaviour of $\{\xi(t)\}$ and
$\{\xi(t + k)\}$ are uniquely determined by the distribution functions
$\{F\}$ ruling the process $\{\xi(t)\}$, and it is further evident that the
resulting distribution functions will satisfy all consistency relations
(53)-(54). Moreover, if $\{\xi(t)\}$ is stationary, the combined process
will be stationary.

The arguments being perfectly general, we can form the processes
$\{\xi(t)\}, \{\xi(t-1)\}, \dots, \{\xi(t-h)\}$ and combine them into an
$(h+1)$-dimensional process. Now, if $\{\xi(t)\}$ is one-dimensional, we
can apply a linear operation of type (58) to the combined process. The
result will be a process, say $\{\zeta(t)\}$, such that the corresponding
variables $\zeta(t_1, \dots, t_n)$ and $\zeta[t]$ will satisfy relations
like

(61)  $\zeta(t_1, \dots, t_n) = a_0 \xi(t_1, \dots, t_n) + a_1 \xi(t_1 - 1, \dots, t_n - 1) + \dots + a_h \xi(t_1 - h, \dots, t_n - h)$,

(62)  $\zeta[t] = a_0 \xi[t] + a_1 \xi[t-1] + \dots + a_h \xi[t-h]$.

If $\{\xi(t)\}$ is stationary, then $\{\zeta(t)\}$ will also be stationary.

The operations considered above may also be applied to
observational time series. Letting $\dots, \xi_{t-1}, \xi_t, \xi_{t+1}, \dots$
represent such a series, and transforming by means of a function
$f[x]$, the resulting series will read
$\dots, f[\xi_{t-1}], f[\xi_t], f[\xi_{t+1}], \dots$. If, in particular,
every $\xi_t$ consists of a set of $k$ observations, say
$\xi_t = (\xi_t^{(1)}, \dots, \xi_t^{(k)})$, and if $f[x]$ is linear, the
transformed series, say $\dots, \zeta_{t-1}, \zeta_t, \zeta_{t+1}, \dots$,
will have for general element

$\zeta_t = a_1 \xi_t^{(1)} + a_2 \xi_t^{(2)} + \dots + a_k \xi_t^{(k)}$.

On the other hand, assuming $\xi_t$ to be one-dimensional, the
transform (61) corresponds simply to a moving linear operation. In this
case the general element in the transformed series reads

(63)  $\zeta_t = a_0 \xi_t + a_1 \xi_{t-1} + \dots + a_h \xi_{t-h}$.
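
For an observed series of finite length, the moving linear operation (63) can be carried out as in the following Python sketch. The function name is hypothetical, and the convention of dropping the first $h$ time points, for which not all lagged observations are available, is an assumption of the illustration rather than something prescribed by the text.

```python
def moving_linear_operation(series, coeffs):
    """Apply zeta_t = a_0*xi_t + a_1*xi_{t-1} + ... + a_h*xi_{t-h}.

    coeffs is (a_0, a_1, ..., a_h). The result contains the transformed
    values for t = h, ..., len(series) - 1, i.e. only for those t where
    all h lagged observations exist.
    """
    h = len(coeffs) - 1
    if h < 0:
        raise ValueError("need at least one coefficient")
    return [sum(a * series[t - i] for i, a in enumerate(coeffs))
            for t in range(h, len(series))]
```

Taking `coeffs = (1, -1)` gives the first differences of the series, and `coeffs = (0.5, 0.5)` a moving average of two terms.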

In the theoretical developments, we shall frequently have to
consider sums of type (58) when the number of terms tends to infinity.
For use in such connexions, we need a suitable definition of
convergence. To this end we shall next extend the concept of
convergence in probability, as introduced by F. P. Cantelli (1916), so
as to apply to a sequence of random processes.

A sequence $\xi^{(1)}, \xi^{(2)}, \dots$ of ordinary random variables is
said to converge in probability to a random variable $\xi$ if for every
$\varepsilon > 0$

$P[|\xi^{(n)} - \xi| > \varepsilon]$

tends to zero as $n \to \infty$. A necessary and sufficient condition for
such convergence is that, for every $\varepsilon > 0$, there exists a
number $n$ such that for an arbitrary $q > 0$ the following inequality
holds

$P[|\xi^{(n+q)} - \xi^{(n)}| > \varepsilon] < \varepsilon$

(see A. Kolmogoroff (1933), p. 32).

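Using the familiar interpretation of $|x - y|$ as the distance between
two points $x$ and $y$ in a multi-dimensional space, the definition of
convergence also holds in case the variables are $k$-dimensional, say
$\xi^{(n)} = [\xi_1^{(n)}, \dots, \xi_k^{(n)}]$, $\xi = [\xi_1, \dots, \xi_k]$.
Of course, an equivalent definition is that for every $\varepsilon > 0$
the probability

$P[|\xi_1^{(n)} - \xi_1| < \varepsilon, \dots, |\xi_k^{(n)} - \xi_k| < \varepsilon]$

tends to unity as $n \to \infty$. Now, considering the elementary
inequality of G. Boole,

(64)  $P[|\xi^{(n)} - \xi| > \varepsilon] \le 1 - P[|\xi_1^{(n)} - \xi_1| \le \delta, \dots, |\xi_k^{(n)} - \xi_k| \le \delta]$
      $\le P[|\xi_1^{(n)} - \xi_1| > \delta] + \dots + P[|\xi_k^{(n)} - \xi_k| > \delta]$,

where $\delta = \varepsilon / \sqrt{k}$, it is evident that a necessary and
sufficient condition that $\xi^{(n)}$ converges in probability to $\xi$ is
that $\xi_r^{(n)}$ converges in probability to $\xi_r$ for
$r = 1, \dots, k$ (see F. P. Cantelli (1916) and E. Slutsky (1925)).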

Thus prepared, let a sequence of random processes be denoted by

(65)  $\{\xi^{(1)}(t)\}, \{\xi^{(2)}(t)\}, \dots$

The sequence will be called convergent in probability to a limit
process $\{\xi(t)\}$ if for an arbitrary set $(t) = (t_1, \dots, t_n)$ the
sequence

(66)  $\xi^{(1)}(t_1, \dots, t_n),\ \xi^{(2)}(t_1, \dots, t_n),\ \dots$

be convergent in probability to the limit variable
$\xi(t_1, \dots, t_n)$.

Theorem 1. A necessary and sufficient condition that a sequence
(65) of random processes be convergent in probability is that for an
arbitrary $t$ the sequence

(67)  $\xi^{(1)}(t),\ \xi^{(2)}(t),\ \dots$

be convergent in probability. If the sequence (65) is convergent, and
if every process $\{\xi^{(n)}(t)\}$ is stationary, the limit process will
be stationary.

The necessity of the condition is implied in the above definition
of convergence in probability. Next, let $(t) = (t_1, \dots, t_n)$ be
arbitrarily fixed, and consider the sequence (66). According to the
above application of the inequality of G. Boole, the convergence of
(66) is implied in the convergence of (67) for every $t$ in the set
$(t_1, \dots, t_n)$. Having stated this, let the limit variable be
denoted by $\xi(t_1, \dots, t_n)$, and consider the limit variables
belonging to all possible sets $(t) = (t_1, \dots, t_n)$. Since the
consistency relations of type (53)-(54) are satisfied for every process
in the sequence (65), the same relations must be satisfied in the
limit. The variables $\xi(t_1, \dots, t_n)$ will thus constitute a random
process, say $\{\xi(t)\}$, and the same argument shows that the limit
process $\{\xi(t)\}$ is stationary if every process in the sequence (65)
is stationary.

The following corollary needs no comment. 

Corollary. Letting $\{\xi(t)\}$ represent a random process, the sequence
(65) converges in probability to $\{\xi(t)\}$ if, and only if, for an
arbitrary $t$ the sequence (67) converges in probability to $\xi(t)$.

In dealing with stationary sequences (65), the theorem proved
above will be particularly useful, for the behaviour of (67) in
respect of convergence will then be independent of $t$. Considering,
in particular, the sum process $\{\zeta_k(t)\}$ defined by (58), a
necessary and sufficient condition for convergence in probability as
$k \to \infty$ is that the sum $\sum_{i=1}^{\infty} a_i \xi^{(i)}(t)$ is
convergent. If convergence takes place, the sums
$\sum_{i=1}^{\infty} a_i \{\xi^{(i)}(t)\}$ and
$\sum_{i=1}^{\infty} a_i \xi^{(i)}[t]$ will be said to converge.

A sufficient condition of convergence is, of course, that the sum
$\sum a_i \cdot E[\xi^{(i)}]$ is convergent, and that the dispersion of
$\sum_{i=n}^{n+p} a_i \xi^{(i)}$ tends to zero uniformly in $p$ as
$n \to \infty$. Moreover, writing $m_n$ and $\sigma_n$ for the mean and
dispersion of $\sum_{i=1}^{n} a_i \xi^{(i)}$, it follows readily that
if these conditions are satisfied, it would imply a contradiction were
not the mean and dispersion of the limit variable given by
$\lim_{n \to \infty} m_n$ and $\lim_{n \to \infty} \sigma_n$ respectively.

As a second application of theorem 1 we make the following
observation. Denoting by $\{\xi(t)\}$ an arbitrary discrete stationary
process, let a sequence (65) of processes be defined by

$\{\xi^{(i)}(t)\} = [\dots, \xi(t+1), \xi(t), \xi(t-1), \dots, \xi(t-i), 0, 0, \dots]$.

Then for every fixed $(t) = (t_1, \dots, t_n)$ we have
$\lim_{i \to \infty} \xi^{(i)}(t_1, \dots, t_n) = \xi(t_1, \dots, t_n)$.
It follows that $\{\xi^{(i)}(t)\}$ converges in probability to
$\{\xi(t)\}$.

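Extending a current terminology, two processes, say $\{\xi^{(1)}(t)\}$
and $\{\xi^{(2)}(t)\}$, will be called equivalent, if for an arbitrary
$(t) = (t_1, \dots, t_n)$ the two variables $\xi^{(1)}(t_1, \dots, t_n)$
and $\xi^{(2)}(t_1, \dots, t_n)$ are equivalent, i.e. if they are equal
with probability 1. If two variables or processes are equivalent, we
shall write $\xi^{(1)} = \xi^{(2)}$, $\{\xi^{(1)}(t)\} = \{\xi^{(2)}(t)\}$,
etc.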


14. On singular stationary processes.

Let $\xi = [\xi^{(1)}, \dots, \xi^{(n)}]$ represent an $n$-dimensional
random variable with distribution function $F(u_1, \dots, u_n)$. The
distribution of $\xi$ will be called linearly singular or, more briefly,
singular, if there exists a linear function, say
$L[x - m] = a_1 (x_1 - m_1) + \dots + a_n (x_n - m_n)$, such that

(68)  $P[L[\xi - m] \ne 0] = P[a_1 (\xi^{(1)} - m_1) + \dots + a_n (\xi^{(n)} - m_n) \ne 0] = 0$.

If (68) is satisfied, the variables $\xi^{(i)}$ will be said to be
connected by the relation $L[\xi - m] = 0$. The singularity will be said
to be of rank $h$, if there exist $n - h$, and only $n - h$, independent
relations between the variables $\xi^{(i)}$, say

(69)  $a_{1,h+1} (\xi^{(1)} - m_1) + \dots + a_{n,h+1} (\xi^{(n)} - m_n) = 0$,
      $\dots$
      $a_{1,n} (\xi^{(1)} - m_1) + \dots + a_{n,n} (\xi^{(n)} - m_n) = 0$.

It will be relevant to interpret singularity in terms of
characteristic functions. Denoting the characteristic function of $\xi$
by $f(X_1, \dots, X_n)$, we have by definition

$f(X_1, \dots, X_n) = E[e^{i (X_1 \xi^{(1)} + \dots + X_n \xi^{(n)})}] = \int e^{i (X_1 x_1 + \dots + X_n x_n)} \, dF(x_1, \dots, x_n)$.


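Letting a matrix of real elements be given by

(70)  $\begin{pmatrix} a_{11} & a_{12} & \dots & a_{1n} \\ a_{21} & a_{22} & \dots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{n1} & a_{n2} & \dots & a_{nn} \end{pmatrix}$

and writing $(Z) = (Z_1, \dots, Z_n)$ for an auxiliary set of variables,
let the substitution

(71)  $X_1 = a_{11} Z_1 + \dots + a_{1n} Z_n$,
      $\dots$
      $X_n = a_{n1} Z_1 + \dots + a_{nn} Z_n$

transform $f(X_1, \dots, X_n)$ into $f^*(Z_1, \dots, Z_n)$. Considering the
variable $\zeta = [\zeta^{(1)}, \dots, \zeta^{(n)}]$ defined by

(72)  $\zeta^{(1)} = a_{11} \xi^{(1)} + \dots + a_{n1} \xi^{(n)}$,
      $\dots$
      $\zeta^{(n)} = a_{1n} \xi^{(1)} + \dots + a_{nn} \xi^{(n)}$,

an elementary transformation shows that

(73)  $f^*(Z_1, \dots, Z_n) = E[e^{i (Z_1 \zeta^{(1)} + \dots + Z_n \zeta^{(n)})}]$.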

Thus $f^*$ is nothing else than the characteristic function of the
composite variable $\zeta$. Now, let $\xi$ be singular of rank $h$, say on
account of the relations (69), and introduce for a moment the
inconsequential assumption $m_i = 0$. In such case
$f^*(Z_1, \dots, Z_n)$ will contain at most the variables
$Z_1, \dots, Z_h$. On the other hand, if the matrix (70) is non-singular
the first $h$ variables must all appear in $f^*(Z_1, \dots, Z_n)$. In
fact, since the distribution of $\zeta$ is uniquely determined by its
characteristic function (see e.g. H. Cramér (1937), p. 104), the
vanishing of $Z_m$ in $f^*$ for a number $m \le h$ would imply
$\zeta^{(m)} = 0$, a linear relation not obtainable from (69).

It will be observed that formula (73) holds for arbitrary values
of the real coefficients $a_{ik}$. In particular, the characteristic
function of the $h < n$ first variables $\zeta^{(i)}$ as given by (72)
will be arrived at by putting $a_{ik} = 0$ for $k = h+1, h+2, \dots, n$.

If $\xi$ is singular on account of a relation of type (68), and if
$E[\xi]$ exists, we shall always assume that

(74)  $m_i = E[\xi^{(i)}]$,

which evidently will involve no loss of generality. Now, calculating
$E[\zeta^{(i)}]$ from (72), and paying regard to (74), we get

(75)  $m_1^* = a_{11} m_1 + \dots + a_{n1} m_n$,
      $\dots$
      $m_n^* = a_{1n} m_1 + \dots + a_{nn} m_n$.

Next, let it be assumed that all the variables $\xi^{(i)}$ have a
finite dispersion. Then the expectation

$E[(X_1 (\xi^{(1)} - m_1) + \dots + X_n (\xi^{(n)} - m_n))^2] = \int [X_1 (x_1 - m_1) + \dots + X_n (x_n - m_n)]^2 \, dF(x_1, \dots, x_n)$

will exist, and represent a non-negative quadratic form in the real
variables $X_i$, say $Q(X_1, X_2, \dots, X_n)$. Writing

$\mu_{ik} = E[(\xi^{(i)} - m_i)(\xi^{(k)} - m_k)]$,

we have $\mu_{ik} = \mu_{ki}$, and

(76)  $Q(X_1, X_2, \dots, X_n) = \sum_{i=1}^{n} \sum_{k=1}^{n} \mu_{ik} \cdot X_i X_k$.

The quadratic form $Q$ will give information concerning the
singularity of the distribution of $\xi$. In fact, the rank of the
quadratic form equals the rank of the distribution. In other words, by
a suitable substitution in $Q$ of type (71) the number of variables may
be brought down to the rank of $\xi$, but not further, and vice versa.
Denoting the transformed forms by $Q^*$, we have also

$Q^*(Z_1, \dots, Z_n) = E[(Z_1 (\zeta^{(1)} - m_1^*) + \dots + Z_n (\zeta^{(n)} - m_n^*))^2]$,

so the forms $Q^*$ thus are quite analogous to the characteristic
functions $f^*$ in respect of linear substitutions. As the proofs, too,
may be given a parallel wording, the verification of the statements
made needs no comment.
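
The equality between the rank of the form (76) and the rank of the distribution can be illustrated on a finite distribution. The following Python sketch computes the coefficients $\mu_{ik}$ exactly and finds the rank of their matrix by elimination; the function names and the example, in which the third variable is the sum of the first two, are assumptions made for the illustration.

```python
from fractions import Fraction


def moment_matrix(samples, probs):
    """Matrix mu_ik = E[(xi_i - m_i)(xi_k - m_k)] of a finite distribution.

    samples: list of equally long tuples, one per outcome;
    probs: the corresponding probabilities (Fractions summing to 1).
    """
    n = len(samples[0])
    means = [sum(p * s[i] for s, p in zip(samples, probs)) for i in range(n)]
    return [[sum(p * (s[i] - means[i]) * (s[k] - means[k])
                 for s, p in zip(samples, probs))
             for k in range(n)] for i in range(n)]


def rank(matrix):
    """Rank of a matrix by exact Gaussian elimination over the rationals."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    found = 0
    for col in range(len(rows[0])):
        pivot = next((r for r in range(found, len(rows))
                      if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[found], rows[pivot] = rows[pivot], rows[found]
        for r in range(found + 1, len(rows)):
            factor = rows[r][col] / rows[found][col]
            rows[r] = [x - factor * y for x, y in zip(rows[r], rows[found])]
        found += 1
    return found


# Hypothetical example: outcomes (x, y, x + y), x and y independent on {0, 1}.
outcomes = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 2)]
weights = [Fraction(1, 4)] * 4
```

In this example $n = 3$ and there is exactly one linear relation between the variables, so the distribution is singular of rank 2, and the matrix of the $\mu_{ik}$ has rank 2 as well.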

After these preliminaries, the next step will be to investigate as
to singularity the variables $\xi(t_1, \dots, t_n)$ connected with a
stationary process $\{\xi(t)\}$. Until further notice, we shall not
assume that $\{\xi(t)\}$ has a finite dispersion. As before it will be
sufficient to consider the variables of type $\xi(t, t-1, \dots, t-n)$.
Let it first be observed that if $\xi(t, t-1, \dots, t-n)$ is singular on
account of the relation

(77)  $\xi(t) - m + a_1 (\xi(t-1) - m) + \dots + a_h (\xi(t-h) - m) = 0$,

where $h \le n$, then (a) the relation of singularity (77) must hold for
every $t$, (b) if $h < n$, the variable $\xi(t, t-1, \dots, t-h)$ must
also present the same singularity. Further, if $\xi(t, t-1, \dots, t-h)$
satisfies a second linear relation of type (77), say with coefficients
$(a_1', \dots, a_h')$, we obtain by subtraction a linear relation showing
that there is a $h' < h$ such that $\xi(t, t-1, \dots, t-h')$ is
singular. Thus, when considering a stationary process $\{\xi(t)\}$, a
number $h$ will be well-defined by terming $\{\xi(t)\}$ singular of rank
$h$ if in the sequence

$\xi(t),\ \xi(t, t-1),\ \dots,\ \xi(t, t-1, \dots, t-n),\ \dots$

the variable $\xi(t, t-1, \dots, t-h)$ is the first singular one. Taking
(77) for the singularity relation of $\xi(t, t-1, \dots, t-h)$, it is
readily seen that

(A) the coefficients $a_i$ are uniquely determined, and $a_h \ne 0$,

(B) for all $k > 0$ the $(k + h + 1)$-dimensional variable
$\xi(t_0 + k, t_0 + k - 1, \dots, t_0 - h)$ is singular of rank $h$, and a
system of $k + 1$ independent relations is given by

(78)  $\xi(t_0) - m + a_1 (\xi(t_0 - 1) - m) + \dots + a_h (\xi(t_0 - h) - m) = 0$,
      $\dots$
      $\xi(t_0 + k - 1) - m + a_1 (\xi(t_0 + k - 2) - m) + \dots + a_h (\xi(t_0 + k - h - 1) - m) = 0$,
      $\xi(t_0 + k) - m + a_1 (\xi(t_0 + k - 1) - m) + \dots + a_h (\xi(t_0 + k - h) - m) = 0$.

Thus, if (77) holds the variable $\zeta(t_1, \dots, t_n)$ defined by (61),
taking $a_0 = 1$, will reduce to $E[\zeta]$. Having stated this, the
relations of type (78) which follow from (77) in the case of
stationarity, may be comprehended in the following identity

(79)  $\xi[t] - m + a_1 (\xi[t-1] - m) + \dots + a_h (\xi[t-h] - m) = 0$.

The term relation of singularity will be used also for (79).
According to observation (B), a sample value

$\xi_1(t_0 + k, t_0 + k - 1, \dots, t_0 - h) = [\xi_1(t_0 + k), \xi_1(t_0 + k - 1), \dots, \xi_1(t_0 - h)]$

will, when regarding $\xi_1(t_0 + t)$ as a function of $t$, with
probability 1 satisfy the difference equation (32). Thus, if this
difference equation reduces to (30), any sample series

$[\xi_1(t_0 + k), \xi_1(t_0 + k - 1), \dots, \xi_1(t_0 - h)]$

will with probability 1 be of type (17), i.e. consist of a number of
superposed harmonics. Speaking in this case of a process of superposed
harmonics, we have the following theorem.

Theorem 2. Let $\{\xi(t)\}$ be a discrete stationary process with
autocorrelation coefficients $r_k$. If $\{\xi(t)\}$ is linearly singular,
$\{\xi(t)\}$ is a process of superposed harmonics. A necessary and
sufficient condition that $\{\xi(t)\}$ be linearly singular, say on
account of the relation $L[\xi(t) - m] = 0$ given by (77), is that $r_k$
satisfies the difference equation $L[r_k] = 0$.
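
The simplest instance of such a relation is furnished by a single harmonic, for which the identity $\cos \lambda t + \cos \lambda (t-2) = 2 \cos \lambda \cdot \cos \lambda (t-1)$ yields a relation of type (77) with $h = 2$, $m = 0$, $a_1 = -2 \cos \lambda$, $a_2 = 1$. The Python sketch below checks this numerically on a sample series; the function names and the particular amplitudes and frequency used are assumptions of the illustration.

```python
import math


def harmonic_series(amplitude_cos, amplitude_sin, lam, length):
    """Sample series xi(t) = A*cos(lam*t) + B*sin(lam*t), t = 0..length-1."""
    return [amplitude_cos * math.cos(lam * t) + amplitude_sin * math.sin(lam * t)
            for t in range(length)]


def difference_residuals(series, lam):
    """Left member xi(t) - 2*cos(lam)*xi(t-1) + xi(t-2), for t >= 2."""
    a1 = -2.0 * math.cos(lam)
    return [series[t] + a1 * series[t - 1] + series[t - 2]
            for t in range(2, len(series))]


# Hypothetical amplitudes and frequency.
sample = harmonic_series(1.5, -0.7, 0.9, 50)
residuals = difference_residuals(sample, 0.9)
```

The residuals vanish up to rounding error whatever amplitudes are chosen, which is the sense in which the coefficients of the relation depend on the frequency alone.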

Denoting by $m$ and $\sigma$ the mean and dispersion of the process
considered, the autocorrelation coefficients are given by (cf. (56))

(80)  $r_k = r_{-k} = E[(\xi(t) - m)(\xi(t-k) - m)] / \sigma^2$.

Thus the quadratic form of type (76) belonging to
$\xi(t, t-1, \dots, t-n)$, say $Q_n(X_t, \dots, X_{t-n})$, will be
well-defined, and given by

$Q_n(X_t, \dots, X_{t-n}) = \sigma^2 \cdot \sum_{p=0}^{n} \sum_{q=0}^{n} r_{|p-q|} \cdot X_{t-p} X_{t-q}$.

Since the form $Q_n$ is non-negative definite, its principal determinant
will be non-negative (see e.g. G. Kowalewski (1909), Chapter 12),

(81)  $\Delta(r, n) = \begin{vmatrix} 1 & r_1 & r_2 & \dots & r_n \\ r_1 & 1 & r_1 & \dots & r_{n-1} \\ r_2 & r_1 & 1 & \dots & r_{n-2} \\ \vdots & \vdots & \vdots & & \vdots \\ r_n & r_{n-1} & r_{n-2} & \dots & 1 \end{vmatrix} \ge 0$.

The determinants $\Delta(r, n)$ defined above will be called the principal
correlation determinants of the stationary process examined.

Thus prepared, let us begin with proving the second part of the
theorem. In the first place, let $\{\xi(t)\}$ be singular on account of
(77), and multiply the left member of this relation by
$\xi(t-k) - m$. Observing that the expectation of the resulting
expression is zero, and paying regard to (80), we get

(82)  $L[r_k] = r_k + a_1 r_{k-1} + \dots + a_h r_{k-h} = 0$.

This relation shows that the condition is necessary.

On the other hand, let the autocorrelation coefficients $r_k$ satisfy
a linear difference equation $L[r_k] = 0$ of order $h'$. Transforming
this equation to the form (32), and reducing to lowest possible order,
say $h$, let $L_1[r_k] = 0$ be the result. According to the previous
analysis, $r_k$ will then satisfy no linear difference relation of order
$< h$. After this remark, let the consecutive rows of $\Delta(r, h)$ be
denoted by $\rho_0, \rho_1, \dots, \rho_h$. From the structure of
$\Delta(r, h)$ it is evident that these rows are connected by the linear
relation $L_1[\rho_k] = 0$. Thus $\Delta(r, h)$ equals zero, so the rank
of $\Delta$ will be $h$. Recalling from the theory of quadratic forms that
the rank of $\Delta(r, n)$ equals the rank of $Q_n$, and keeping in mind
the identity between the ranks of corresponding distributions and
quadratic forms of type (76), it turns out that
$\xi(t, t-1, \dots, t-h)$ is linearly singular. Let the relation of
singularity be $L_2[\xi(t) - m] = 0$. Now, were not $L_2 = L_1$, then,
contrarily to the assumption made, $r_k$ would satisfy a linear
difference relation of order $< h$. Moreover, according to the
construction of $L_1$ we have $L[r_k] = L^*[L_1[r_k]]$, where $L^*$ is a
well-defined linear operation. Thus
$L[\xi(t) - m] = L^*[L_1[\xi(t) - m]] = L^*[0] = 0$, which proves that
the condition is sufficient.

In order to prove the first part of theorem 2, let (77) be the
relation of singularity, and let it be assumed that this has already
been reduced to the lowest possible order. According to the above,
$r_k$ will then satisfy (82), but no linear equation of lower order.
Writing $r_k$ on the form (33), this implies that none of the
polynomials appearing in (33) will be vanishing identically. Since,
finally, the inequality $|r_k| \le 1$ shows that $r_k$ is uniformly
bounded in modulus in $(-\infty < k < \infty)$, we conclude from the
second remark in section 6 that $r_k$ can be written on the form (17).

After this analysis, the following theorem will need no comment. 

Theorem 3. Let $\{\xi(t)\}$ be a discrete stationary process with
principal correlation determinants $\Delta(r, n)$ given by (81). A
necessary and sufficient condition that $\{\xi(t)\}$ be singular of rank
$h$ is that $\Delta(r, h)$ be the first vanishing determinant in the
sequence $\Delta(r, 1), \Delta(r, 2), \dots$
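
The criterion of theorem 3 lends itself to direct computation when the autocorrelation coefficients are known. The Python sketch below evaluates the determinants (81) exactly and looks for the first one that vanishes; the function names, and the example sequence $r_k = \cos(k \pi / 3)$, which satisfies $r_k - r_{k-1} + r_{k-2} = 0$, are assumptions of the illustration.

```python
from fractions import Fraction


def correlation_determinant(r, n):
    """Principal correlation determinant Delta(r, n) as in (81).

    r is a sequence with r[0] = 1, r[1], ..., r[n]; the matrix has the
    element r[|p - q|] in row p and column q, for p, q = 0, ..., n.
    """
    rows = [[Fraction(r[abs(p - q)]) for q in range(n + 1)]
            for p in range(n + 1)]
    det = Fraction(1)
    for col in range(n + 1):
        pivot = next((i for i in range(col, n + 1) if rows[i][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            rows[col], rows[pivot] = rows[pivot], rows[col]
            det = -det  # a row interchange changes the sign
        det *= rows[col][col]
        for i in range(col + 1, n + 1):
            factor = rows[i][col] / rows[col][col]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[col])]
    return det


def singular_rank(r, max_order):
    """Smallest h in 1..max_order with Delta(r, h) == 0, or None."""
    for h in range(1, max_order + 1):
        if correlation_determinant(r, h) == 0:
            return h
    return None


# Hypothetical example: r_k = cos(k*pi/3) for k = 0, ..., 5.
half = Fraction(1, 2)
harmonic_r = [1, half, -half, -1, -half, half]
```

For `harmonic_r` one finds $\Delta(r, 1) = 3/4$ and $\Delta(r, 2) = 0$, so the rank is 2, in agreement with the order of the difference relation; for a non-autocorrelated sequence such as `[1, 0, 0, 0]` no determinant vanishes and `singular_rank` returns `None`.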

A relation of type (77) will be called a stochastical difierence rela- 
tion of order h satisfied by the stationary process The pre- 

vious analysis shows that a stationary process which has finite 
dispersion and satisfies (77) will satisfy a difference relation of type 

(83) - .9) + ^ + 1) + . . . q. == 0. 

In the next two sections it will be shown, i. a., that the theorems 
of the present section are not vacuous, i. e. that there really exist 
stationary processes having the properties assumed by hypothesis. 
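
The criterion of theorem 3 can be tried out numerically. The following is only a sketch: it assumes that $\Delta(r, n)$ is the determinant of order $n + 1$ with elements $r_{|i-j|}$ (formula (81) is not reproduced in this section), and the function names are illustrative.

```python
from fractions import Fraction


def determinant(matrix):
    """Determinant by Gaussian elimination in exact rational arithmetic."""
    m = [[Fraction(v) for v in row] for row in matrix]
    n = len(m)
    det = Fraction(1)
    for col in range(n):
        # find a row at or below the diagonal with a non-zero entry in this column
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det
        det *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n):
                m[r][c] -= factor * m[col][c]
    return det


def correlation_determinant(r, n):
    """Assumed form of Delta(r, n): (n+1) x (n+1) determinant with entries r[|i - j|]."""
    return determinant([[r[abs(i - j)] for j in range(n + 1)] for i in range(n + 1)])


def rank_of_singularity(r):
    """Smallest h >= 1 with Delta(r, h) == 0, or None if no determinant vanishes."""
    for h in range(1, len(r)):
        if correlation_determinant(r, h) == 0:
            return h
    return None


print(rank_of_singularity([1, -1, 1, -1]))    # autocorrelations of a series of period 2
print(rank_of_singularity([1, 0, -1, 0, 1]))  # autocorrelations of a harmonic of period 4
```

The first call returns 1 and the second returns 2, in line with difference relations of the first and second order respectively.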


15. Some type cases of the discrete stationary process. 

As mentioned in section 12, it will be shown in the present sec- 
tion that the schemes surveyed in Chapter I may be regarded as 
special cases of the discrete stationary process. Some other schemes 
will also be presented, and a few characteristic properties of the 
different types considered be pointed out. Conditioned variables 
and expectations, and the operation of addition will be exemplified, 
and devices given for the construction of model series which follow 
the different schemes. For further concreteness some model series 
constructed for the illustration of later results will be furnished. 

α. The purely random process. This term will in the sequel be
used for the purely random scheme touched upon in section 7.
The purely random process will, of course, be obtained by taking,
in the relation (52) defining the general process,

$F(t_1, \ldots, t_n;\, u_1, \ldots, u_n) = F(u_1) \cdot F(u_2) \cdots F(u_n),$

where any distribution function may be chosen for $F(u)$. The
verification of (53)–(55) is obvious.

When detailed information is required, a purely random process
$\{\xi(t)\}$ defined by a distribution function $F(u)$ will be denoted by
$\{\xi(t; F)\}$. It is seen that the defining function $F(u)$ is identical
with the principal distribution function of the process.

The following simple theorem exemplifies the operation of addition
of independent processes.

Theorem 4. Let $\{\xi^{(1)}(t; F^{(1)})\}, \{\xi^{(2)}(t; F^{(2)})\}, \ldots$ represent independent,
purely random processes such that the infinite convolution

(84)  $F^{(1)} * F^{(2)} * \cdots$

is convergent. Then the sum $\{\xi^{(1)}(t)\} + \{\xi^{(2)}(t)\} + \cdots$ will
be convergent, and constitute a purely random process with the
convolution (84) for principal distribution function.

In fact, the convergent convolution (84) is the distribution function
of the sum $\sum_{i=1}^{\infty} \xi^{(i)}(t; F^{(i)})$, which is thus convergent. According
to a remark attached to theorem 1, the convergence of this sum
implies the convergence of $\sum_{i=1}^{\infty} \{\xi^{(i)}(t)\}$.

A characteristic property of the purely random process is that
the two variables $\xi_C(t_1, \ldots, t_n)$ and $\xi(t_1, \ldots, t_n)$ will have identical
distribution functions for any condition $(C)$ not referring to any time
point in the set $(t) = (t_1, \ldots, t_n)$. Thus we have in this case (cf. (57))

Any random series, e. g. a series of records on throws with a 
die, will form a model series of the purely random process. The 
illustrative model series used in the present study are very 
simply constructed, the double purpose being to facilitate the cal- 
culations, and to bring into relief the characteristic features of 
the different types of process. For this construction, the well- 
known random sampling numbers of L. H. C. Tippett (1927) were 
used. 

Two independent random series, denoted by $(\alpha_t^{(1)})$ and $(\alpha_t^{(2)})$, will be
given for illustration. Denoting the corresponding processes by
$\{\alpha^{(1)}(t; F_1)\}$ and $\{\alpha^{(2)}(t; F_2)\}$ respectively, the defining function $F_1$ is
given by

$F_1(u) = 0$ for $u < -1$,       $F_1(u) = 0.1$ for $-1 \le u < 0$,
$F_1(u) = 0.9$ for $0 \le u < 1$,  $F_1(u) = 1$ for $1 \le u$,

and the function $F_2$ by

$F_2(u) = 0$ for $u < -1$,       $F_2(u) = 0.3$ for $-1 \le u < 0$,
$F_2(u) = 0.7$ for $0 \le u < 1$,  $F_2(u) = 1$ for $1 \le u$.

A short calculation shows that the mean value of each process
equals zero, and that the variances, say $D_1^2$ and $D_2^2$, are

(85)  $D_1^2 = 0.2, \qquad D_2^2 = 0.6.$
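
The short calculation behind (85) may be sketched as follows. The probabilities are the jumps of the two step functions at $u = -1, 0, 1$; the helper name `moments` is illustrative only.

```python
from fractions import Fraction


def moments(pmf):
    """Return (mean, variance) of a discrete distribution given as {value: probability}."""
    mean = sum(v * p for v, p in pmf.items())
    var = sum((v - mean) ** 2 * p for v, p in pmf.items())
    return mean, var


# Jumps of F_1 and F_2 at u = -1, 0, 1.
pmf_1 = {-1: Fraction(1, 10), 0: Fraction(8, 10), 1: Fraction(1, 10)}
pmf_2 = {-1: Fraction(3, 10), 0: Fraction(4, 10), 1: Fraction(3, 10)}

mean_1, var_1 = moments(pmf_1)
mean_2, var_2 = moments(pmf_2)
print(mean_1, var_1)  # 0 1/5
print(mean_2, var_2)  # 0 3/5
```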

Writing $\tau$ for the Tippett numbers, the model series $(\alpha_t^{(1)})$ consists
of the 1000 elements obtained from the first 1000 $\tau$-numbers
on page 1 by the use of the following code:

$\alpha_t^{(1)} = 1$ for $\tau = 0$;  $\alpha_t^{(1)} = 0$ for $\tau = 1, \ldots, 8$;  $\alpha_t^{(1)} = -1$ for $\tau = 9$.

The second series was obtained from the corresponding $\tau$-numbers
on page 2. The code used was

$\alpha_t^{(2)} = 1$ for $\tau = 0, 1, 2$;  $\alpha_t^{(2)} = 0$ for $\tau = 3, \ldots, 6$;
$\alpha_t^{(2)} = -1$ for $\tau = 7, 8, 9$.
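
A coding of random digits of this kind is easily mechanized. The sketch below applies the first code (digit 0 gives 1, digit 9 gives $-1$, all other digits give 0); the digits used are hypothetical and are not taken from Tippett's tables, and the function name is illustrative.

```python
def code_digits(digits, plus_one, minus_one):
    """Map decimal digits to -1, 0, +1: digits in plus_one give +1,
    digits in minus_one give -1, every other digit gives 0."""
    out = []
    for d in digits:
        if d in plus_one:
            out.append(1)
        elif d in minus_one:
            out.append(-1)
        else:
            out.append(0)
    return out


# Hypothetical digits, NOT taken from Tippett's tables.
sample_digits = [2, 9, 5, 2, 0, 7, 9, 1]
alpha_1 = code_digits(sample_digits, plus_one={0}, minus_one={9})
print(alpha_1)  # [0, -1, 0, 0, 1, 0, -1, 0]
```

Any other three-valued code is obtained by changing the two digit sets.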

The first 100 elements in each of the α-series will now be quoted.


Table 1. (1) Model series $(\alpha_t^{(1)})$; first 100 elements (read row by row).

 0 -1  0  0  0  0  0  0  0 -1 -1  0 -1  0 -1  0  0 -1  0 -1
 0 -1  0  0  0  0  0  1  0  0  0  0  0  0  0  0 -1  0  0  0
 0  0  0  0  0  0 -1  0  0  0  1  0  0  0  0  0  0  0  1  1
 0  0 -1  0  0  0  0  1  0  0  0  0  0  0  1  0  0  0  0  0
 0  0  0  0  0  1  0 -1  0 -1  0  0  0  0 -1  0  1  0  0  1







(2) Model series $(\alpha_t^{(2)})$; first 100 elements (read row by row).

 1  1  0  0  1 -1  0 -1 -1  0  0 -1  0  1  1  0  0  0 -1  0
-1  0 -1  0  1  0  1 -1  0  0 -1 -1  0  0  0  0  0 -1  1  1
 1 -1  1  1 -1  1  0  0  0 -1 -1 -1  1  0  1  1 -1 -1  1  1
 1  1  1  1  0  1  0  1  1  0  1  1  0  1  1  1 -1 -1 -1 -1
 0 -1 -1  0  0  0  1  1  1 -1  0  0  1  0 -1  0 -1 -1  1 -1


The α-series will be used repeatedly in the present study. For this reason each
of the series, considered as a statistical population with hypothetical distribution
functions $F_1(u)$ and $F_2(u)$ respectively, has been tested as to the goodness of fit between
the empirical and the theoretical distributions. Denoting the empirical distribution
functions by $F_1^*(u)$ and $F_2^*(u)$ respectively, these were found to be

(86)  $F_1^*(u) = 0$ for $u < -1$,         $F_1^*(u) = 0.100$ for $-1 \le u < 0$,
      $F_1^*(u) = 0.908$ for $0 \le u < 1$,  $F_1^*(u) = 1$ for $1 \le u$;

(87)  $F_2^*(u) = 0$ for $u < -1$,         $F_2^*(u) = 0.321$ for $-1 \le u < 0$,
      $F_2^*(u) = 0.692$ for $0 \le u < 1$,  $F_2^*(u) = 1$ for $1 \le u$.

The $\omega^2$-test indicates a good fit. As is readily verified by the insertion of (86)
and (87) in formulae (322) and (323), the two α-series give the $\omega^2$-values 0.000064 and
0.000505, while the corresponding expectations are 0.000180 and 0.000420.

The model series $(\alpha_t)$ have also been tested with regard to the concordance between
serial and autocorrelation coefficients. The latter vanish for $k \ne 0$. On the
other hand, writing $n$ for the number of elements in the series, and paying no regard
to terms of order $1/n$, the sampling dispersion of any autocorrelation coefficient of
the two series is found to be $1/\sqrt{n} = 0.032$. The first five serial coefficients are
given below.

                           k=1     k=2     k=3     k=4     k=5
$r_k(\alpha_t^{(1)})$     0.057   0.047   0.010   0.015   0.036
$r_k(\alpha_t^{(2)})$     0.046   0.011   0.006  -0.004  -0.004

The deviation from the corresponding autocorrelation coefficient is in no case
larger than the double dispersion. It is rather interesting to note that although the
series consist of as many as 1000 elements, a serial coefficient amounting to 0.06
cannot be considered significantly positive.
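
The order of magnitude of the sampling dispersion is easily checked. In the sketch below the form of the serial coefficient (lagged products of deviations from the mean, divided by the sum of squared deviations) is an assumption, being one common definition; the definition used in this study is not restated in the present section.

```python
import math


def serial_coefficient(x, k):
    """Assumed definition: sum of lagged products of deviations from the mean,
    divided by the sum of squared deviations."""
    n = len(x)
    mean = sum(x) / n
    num = sum((x[t] - mean) * (x[t + k] - mean) for t in range(n - k))
    den = sum((v - mean) ** 2 for v in x)
    return num / den


# Approximate sampling dispersion 1/sqrt(n) for n = 1000 elements.
dispersion = 1 / math.sqrt(1000)
print(round(dispersion, 3))      # 0.032
print(round(2 * dispersion, 3))  # 0.063, the "double dispersion"

# A strictly alternating series of 10 elements has a strongly negative first coefficient.
print(serial_coefficient([1, -1] * 5, 1))
```

Since twice the dispersion is about 0.063, a coefficient of 0.06 indeed lies within the double dispersion.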

The first 20 values of the series $(\alpha_t^{(1)} - \alpha_t^{(2)})$ obtained from table 1
are given below to illustrate a linear operation with independent
random processes:

(88)  -1 -2 0 0 -1 1 0 1 1 -1 -1 1 -1 -1 -2 0 0 -1 1 -1

The series thus obtained is a model series of the process $\{\alpha^{(1)}(t)\} - \{\alpha^{(2)}(t)\}$.
Since in the present case $-\alpha^{(2)}(t)$ is found to possess the same
defining distribution function as $\alpha^{(2)}(t)$, the series (88) also
forms a model series for the process $\{\alpha^{(1)}(t)\} + \{\alpha^{(2)}(t)\}$.
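
The series (88) may be checked against table 1. The sketch below takes the first 20 elements of the two α-series as they are read here from table 1 and forms the difference of the first series and the second, element by element.

```python
# First 20 elements of the two alpha-series, as read from table 1.
alpha_1 = [0, -1, 0, 0, 0, 0, 0, 0, 0, -1, -1, 0, -1, 0, -1, 0, 0, -1, 0, -1]
alpha_2 = [1, 1, 0, 0, 1, -1, 0, -1, -1, 0, 0, -1, 0, 1, 1, 0, 0, 0, -1, 0]

# Element-by-element difference of the two series.
series_88 = [a - b for a, b in zip(alpha_1, alpha_2)]
print(series_88)
```

The result agrees with the 20 values quoted under (88).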

We shall next pass to some other type cases of the discrete
stationary process, denoted by γ and δ, which will be built up
by successive linear operations on the purely random processes.

β. The process of moving averages. According to the analysis
in section 13, a stationary process $\{\xi(t)\}$ will be obtained by taking

(89)  $\xi(t) = b_0\,\eta(t) + b_1\,\eta(t-1) + \cdots + b_h\,\eta(t-h),$

letting $\{\eta(t)\}$ represent a purely random process, say $\{\eta(t; F_\eta(u))\}$,
and $(b) = (b_0, \ldots, b_h)$ an arbitrary sequence of real numbers. The
type of process thus defined will be called the process of moving
averages. A specific process (89) of this type will sometimes be
denoted by $\{\xi(t; \eta)\}$. The purely random process $\{\eta(t)\}$ and the
variables $\eta(t_1, \ldots, t_n)$ will be called primary in respect of $\{\xi(t; \eta)\}$
and $\xi(t_1, \ldots, t_n)$ respectively.

Let it be observed that, for any constant $c > 0$, the variable (89)
is identical with that defined by the variables $\eta(t; F_\eta(c \cdot u))$ and by
the sequence $c \cdot (b) = (c \cdot b_0, \ldots, c \cdot b_h)$. Therefore, the assumption
$b_0 = 1$ often imposed on (89) in the following will not restrict the
generality of the analysis. On the other hand, if $\sum b_i \ne 0$, we can
find an identical process such that $\sum b_i = 1$. Hence the name proposed
for the process.

The principal distribution functions $F_\xi(u)$ and $F_\eta(u)$ of $\{\xi(t)\}$ and
$\{\eta(t)\}$ respectively are connected by the relation

(90)  $F_\xi(u) = F_\eta(u/b_0) * F_\eta(u/b_1) * \cdots * F_\eta(u/b_h).$

In case $D(\eta(t))$ is finite, we obtain further

$D^2(\xi(t)) = (b_0^2 + b_1^2 + \cdots + b_h^2) \cdot D^2(\eta(t)).$
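
As a small worked instance of the last formula, the variance of a moving average is the sum of the squared weights times the variance of the primary process. The sketch below assumes weights $(1, -1)$, i.e. first differences, and a primary variance of 0.6; the function name is illustrative.

```python
from fractions import Fraction


def moving_average_variance(b, primary_variance):
    """Variance of b_0*eta(t) + ... + b_h*eta(t-h) for independent eta with the given variance."""
    return sum(Fraction(w) ** 2 for w in b) * primary_variance


# First differences (weights 1 and -1) of a purely random process with variance 0.6.
print(moving_average_variance([1, -1], Fraction(3, 5)))  # 6/5
```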

For exemplifying conditioned variables connected with a process
$\{\xi(t; \eta)\}$ of moving averages, let $m < h$, and let $C$ denote the condition

$(C) = (\eta(t-m-1) = \eta_{m+1},\ \eta(t-m-2) = \eta_{m+2},\ \ldots,\ \eta(t-h) = \eta_h).$

Then the conditioned variable $\xi_C(t)$ will be given by

$\xi_C(t) = b_0\,\eta(t) + \cdots + b_m\,\eta(t-m) + b_{m+1}\,\eta_{m+1} + \cdots + b_h\,\eta_h.$

As is readily verified, we have in case $D(\eta(t))$ is finite,

(91)  $E_C[\xi(t)] = (b_0 + \cdots + b_m) \cdot E[\eta(t)] + b_{m+1}\,\eta_{m+1} + \cdots + b_h\,\eta_h.$
Model series for the process of moving averages are readily
obtained by applying a moving linear operation to a model series
for the purely random process. For instance, the series of differences
of any order $k$, obtained from a purely random series
$\alpha_t$, will illustrate the process of moving averages. Below, as an
illustrative model series, are given the first 100 values of
$\xi_t = \alpha_{t+1}^{(2)} - \alpha_t^{(2)}$ obtained from Table 1. The corresponding process
will be denoted

(92)  $\{\xi(t; \alpha)\} = \{\alpha(t+1; F_2)\} - \{\alpha(t; F_2)\}.$


Table 2. Model series $(\xi_t)$; first 100 elements (read row by row).

 0 -1  0  1 -2  1 -1  0  1  0 -1  1  1  0 -1  0  0 -1  1 -1
 1 -1  1  1 -1  1 -2  1  0 -1  0  1  0  0  0  0 -1  2  0  0
-2  2  0 -2  2 -1  0  0 -1  0  0  2 -1  1  0 -2  0  2  0  0
 0  0  0 -1  1 -1  1  0 -1  1  0 -1  1  0  0 -2  0  0  0  1
-1  0  1  0  0  1  0  0 -2  1  0  1 -1 -1  1 -1  0  2 -2  2
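
The construction of table 2 may be followed by forming first differences. The sketch below uses the first 21 elements of the second α-series as read here from table 1 and forms $\alpha_{t+1} - \alpha_t$; the function name is illustrative.

```python
def first_differences(series):
    """Return the series alpha_{t+1} - alpha_t, one element shorter than the input."""
    return [series[t + 1] - series[t] for t in range(len(series) - 1)]


# First 21 elements of the second alpha-series, as read from table 1.
alpha_2 = [1, 1, 0, 0, 1, -1, 0, -1, -1, 0, 0, -1, 0, 1, 1, 0, 0, 0, -1, 0, -1]

xi = first_differences(alpha_2)
print(xi)
```

The 20 differences agree with the opening values of table 2.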


γ. The general process of linear regression. Let $\{\eta(t)\}$ stand for
a purely random process with finite dispersion $D(\eta)$, and let
$b_0, b_1, b_2, \ldots$ be a real sequence such that $\sum_{k=0}^{\infty} b_k^2$ is convergent. Finally,
we must assume either that $E[\eta] = 0$ or that $\sum_{k=0}^{\infty} b_k$ be convergent.
These alternatives will present themselves repeatedly in the sequel.
The former assumption is better suited to our purpose; the modifications
caused by the latter are trivial. Accordingly, we shall
always assume that $E[\eta] = 0$.

Considering the series

(93)  $b_0\,\eta(t) + b_1\,\eta(t-1) + b_2\,\eta(t-2) + \cdots,$

it follows from the independence of the variables $\eta(t)$ that the
variance of

(94)  $b_n\,\eta(t-n) + b_{n+1}\,\eta(t-n-1) + \cdots + b_{n+p}\,\eta(t-n-p)$

is given by

$(b_n^2 + b_{n+1}^2 + \cdots + b_{n+p}^2) \cdot D^2(\eta).$

The dispersion of (94) thus tends to zero uniformly in $p$ as $n \to \infty$.
Accordingly, the sum (93) is convergent (cf. p. 41). As remarked in
connexion with theorem 1, it follows that we may define a
stationary process $\{\xi(t)\}$ by writing

(95)  $\xi(t) = b_0\,\eta(t) + b_1\,\eta(t-1) + b_2\,\eta(t-2) + \cdots$

By definition, this is the general formula for a process of linear
regression.

Having dealt under article β with a particular case of the general
process of linear regression, we shall in the next article build
up the scheme of autoregression as another type case. As a matter 
of fact, in the present study the general process will be chiefly 
used as a convenient tool for comprehension of the two type cases 
mentioned, and it is only these which will appear in the applica- 
tions. It will also be sufficient to refer to the type cases for 
model series. 

δ. The process of linear autoregression. Let $(a) = (a_1, \ldots, a_h)$ stand
for real numbers such that $a_h \ne 0$, and that the roots of the
equation (34) all are of a modulus less than 1. Let further
$(b) = (b_1, b_2, \ldots)$ be a sequence such that (A) the difference equation

(96)  $x(t) + a_1\,x(t-1) + \cdots + a_h\,x(t-h) = 0$

is satisfied when $x(t) = b_t$, and (B) the initial values $b_1, \ldots, b_h$ are
solutions of the following system of linear equations

(97)  $a_1 + b_1 = 0,$
      $a_2 + a_1 b_1 + b_2 = 0,$
      $\ldots$
      $a_{h-1} + a_{h-2} b_1 + \cdots + a_1 b_{h-2} + b_{h-1} = 0,$
      $a_h + a_{h-1} b_1 + \cdots + a_1 b_{h-1} + b_h = 0.$

Since the determinant of the system (97) is given by

$\begin{vmatrix} 1 & 0 & 0 & \cdots & 0 & 0 \\ a_1 & 1 & 0 & \cdots & 0 & 0 \\ a_2 & a_1 & 1 & \cdots & 0 & 0 \\ \vdots & & & & & \vdots \\ a_{h-1} & a_{h-2} & a_{h-3} & \cdots & a_1 & 1 \end{vmatrix}$

and thus equals 1, the initial $b$-values are uniquely determined.
It is also seen that all $b_k$ are real. Moreover, the series $\sum_{k=0}^{\infty} |b_k|$ is
convergent (cf. section 6). Letting $\{\eta(t)\}$ represent a purely random
process with finite dispersion, the conditions indicated under article
γ thus will be satisfied. Hence, a stationary process $\{\xi(t)\} = \{\xi(t; \eta)\}$
will be defined by putting

(98)  $\xi(t) = \eta(t) + b_1\,\eta(t-1) + b_2\,\eta(t-2) + \cdots$

By definition, this operation gives rise to the general process of
(linear) autoregression. Since $a_h \ne 0$, the autoregression will be said
to be of order $h$.
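
Conditions (A) and (B) together amount to a single recursion for the weights: with $b_0 = 1$, each $b_k$ equals minus the sum of the products $a_i b_{k-i}$ over those $i \le h$ for which $k - i \ge 0$. The following sketch (illustrative function name, exact arithmetic through `Fraction`) computes the first weights for given coefficients.

```python
from fractions import Fraction


def autoregression_weights(a, n):
    """Return [b_0, ..., b_n] with b_0 = 1 and
    b_k = -(a_1*b_{k-1} + ... + a_h*b_{k-h}), dropping terms with negative index."""
    b = [Fraction(1)]
    for k in range(1, n + 1):
        s = sum(a[i - 1] * b[k - i] for i in range(1, min(k, len(a)) + 1))
        b.append(-s)
    return b


# First-order example with a_1 = 4/5: the weights form a geometric sequence.
print(autoregression_weights([Fraction(4, 5)], 3))

# Second-order example (hypothetical coefficients a_1 = -1/5, a_2 = 13/20).
print(autoregression_weights([Fraction(-1, 5), Fraction(13, 20)], 2))
```

For $k \le h$ the recursion reproduces the system (97), and for $k > h$ it is the difference equation (96) with $x(t) = b_t$.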

As pointed out in section 13, the variables $\zeta(t)$ defined by the
following linear operation on the variables $\xi(t; \eta)$ given by (98)
will likewise constitute a stationary process:

$\zeta(t) = \xi(t) + a_1\,\xi(t-1) + \cdots + a_h\,\xi(t-h).$

It will now be shown that the process $\{\zeta(t)\}$ thus defined is equivalent
with $\{\eta(t)\}$. The proof is based on a transformation of a double sum
of aleatory variables.

By definition, we have

$\zeta(t) = \sum_{i=0}^{h} a_i \sum_{k=0}^{\infty} b_k\,\eta(t-i-k) = \lim_{N \to \infty} \sum_{i=0}^{h} \sum_{k=0}^{N} a_i\, b_k\,\eta(t-i-k),$

where $a_0$ and $b_0$ should be given the value 1. Introducing an
auxiliary variable $\zeta_N(t)$, an elementary transformation shows that

$\zeta_N(t) = \sum_{i=0}^{h} a_i \sum_{k=0}^{N} b_k\,\eta(t-i-k)$
$\quad = \sum_{p=0}^{h-1} \eta(t-p) \sum_{q=0}^{p} a_q b_{p-q} + \sum_{p=h}^{N} \eta(t-p) \sum_{q=0}^{h} a_q b_{p-q} + \sum_{p=N+1}^{N+h} \eta(t-p) \sum_{q=p-N}^{h} a_q b_{p-q}$
$\quad = \eta(t) + c_1\,\eta(t-N-1) + \cdots + c_h\,\eta(t-N-h),$

where the second transformation is a consequence of (96) and (97).
Putting $a = \max |a_i|$, the coefficients $c_s$ introduced are seen to
satisfy the inequality

(99)  $|c_s| \le a \cdot (|b_{N+s-h}| + |b_{N+s+1-h}| + \cdots + |b_{N-1}| + |b_N|).$

Paying regard to the convergence of $\sum |b_k|$, we conclude without
difficulty that $c_1\,\eta(t-N-1) + \cdots + c_h\,\eta(t-N-h)$ tends to zero
in probability as $N \to \infty$. Thus, $\zeta_N(t)$ tends to $\eta(t)$ in probability.
According to the corollary of theorem 1, the proof that $\{\zeta(t)\}$ equals
$\{\eta(t)\}$ is thereby completed, and we get the following fundamental
identity,

(100)  $\{\xi(t)\} + a_1\{\xi(t-1)\} + \cdots + a_h\{\xi(t-h)\} = \{\eta(t)\}.$

The relation (100) implies, i.a., that

(101)  $\xi(t) + a_1\,\xi(t-1) + \cdots + a_h\,\xi(t-h) = \eta(t).$

Observing further that $\eta(t)$ is independent of $\eta(t-1), \eta(t-2), \ldots$
and, according to (98), also of $\xi(t-1), \xi(t-2), \ldots$, the relation
(101) shows that the variables $\xi(t), \xi(t-1), \ldots, \xi(t-h)$ are
connected by a relation of linear regression. Hence the name
proposed for the process.

A simple illustration of conditioned variables is given by

$\xi_C(t) = \eta(t) - a_1\,\xi_{t-1} - a_2\,\xi_{t-2} - \cdots - a_h\,\xi_{t-h},$

where $(C) = (\xi(t-1) = \xi_{t-1}, \ldots, \xi(t-h) = \xi_{t-h})$.
More general formulae will be given in section 23.

Construction of model series for a process of autoregression,
given for example by (98), may be performed in the same way as
in the case of a finite moving average of a purely random series.
The difficulty of an infinite number of weights $b_k$ is but apparent,
for when a certain precision in the calculations is fixed, only a
finite number of the weights, say $H$ of them, will be found to have
any influence.

Denoting by $\delta_t$ the values in a model series of the present type,
and representing the primary model series by $(\alpha_t)$, the formula for
the construction reads (cf. (63))

$\delta_t = \alpha_t + b_1\,\alpha_{t-1} + b_2\,\alpha_{t-2} + \cdots + b_H\,\alpha_{t-H}.$

Having constructed $\delta_1, \delta_2, \ldots, \delta_h$ according to this formula, the
subsequent values $\delta_{h+1}, \delta_{h+2}$, etc. may be obtained from the more
convenient recurrence formula

(102)  $\delta_t = \alpha_t - a_1\,\delta_{t-1} - a_2\,\delta_{t-2} - \cdots - a_h\,\delta_{t-h}.$

In the illustrative series given below, a slight simplification has
been made, in that formula (102) has been applied also for $t = 1,
2, \ldots, h$, taking $\delta_t = 0$ for $t < 1$. As the series consist of 1000
elements each, this modification of the first few elements will not
have any disturbing effect upon serial coefficients and other quantities
relating to the whole of the series. On the other hand, thanks
to the modification adopted, the construction of the model series
may be readily followed in detail.

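Three illustrative series, denoted by $(\delta_t^{(1)})$, $(\delta_t^{(2)})$ and $(\delta_t^{(3)})$ respectively,
will be presented. The formulae of type (100) for the corresponding
processes read

(103)  $\{\delta^{(1)}(t)\} = \{\alpha^{(2)}(t)\} - 0.8\,\{\delta^{(1)}(t-1)\},$

(104)  $\{\delta^{(2)}(t)\} = \{\alpha^{(1)}(t)\} + 0.8\,\{\delta^{(2)}(t-1)\},$

(105)  $\{\delta^{(3)}(t)\} = \{\alpha^{(2)}(t)\} + 0.2\,\{\delta^{(3)}(t-1)\} - 0.65\,\{\delta^{(3)}(t-2)\}.$

The verification of the recurrent calculation of the δ-series needs
no comment.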
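
The recurrent calculation may be sketched as follows, applying formula (102) from the first element onwards with $\delta_t = 0$ for $t < 1$. The example assumes a first-order scheme with $a_1 = 0.8$ and takes as primary series the first eight elements of the second α-series as read from table 1; under these assumptions the result reproduces the opening values of table 3 (1) to two decimals. The function name is illustrative.

```python
def autoregressive_series(alpha, a):
    """Apply delta_t = alpha_t - a_1*delta_{t-1} - ... - a_h*delta_{t-h},
    with delta_t taken as 0 before the start of the series."""
    delta = []
    for t, value in enumerate(alpha):
        for i, coeff in enumerate(a, start=1):
            if t - i >= 0:
                value -= coeff * delta[t - i]
        delta.append(value)
    return delta


# First eight elements of the second alpha-series (table 1), assumed primary series.
alpha_2 = [1, 1, 0, 0, 1, -1, 0, -1]

delta_1 = autoregressive_series(alpha_2, [0.8])
print([round(v, 2) for v in delta_1])
```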


Table 3. (1) Model series $(\delta_t^{(1)})$. First 50 elements (read row by row).

  1.00   0.20  -0.16   0.13   0.90  -1.72   1.37  -2.10   0.68  -0.54
  0.44  -1.35   1.08   0.14   0.90  -0.72   0.57  -0.46  -0.64   0.51
 -1.41   1.12  -1.90   1.52  -0.22   0.17   0.86  -1.69   1.35  -1.08
 -0.13  -0.89   0.71  -0.57   0.46  -0.37   0.29  -1.23   1.99  -0.59
  1.47  -2.18   2.74  -1.19  -0.04   1.04  -0.83   0.66  -0.53  -0.58

(2) Model series $(\delta_t^{(2)})$. First 50 elements (read row by row).

  0.00  -1.00  -0.80  -0.64  -0.51  -0.41  -0.33  -0.26  -0.21  -1.17
 -1.93  -1.55  -2.24  -1.79  -2.43  -1.95  -1.56  -2.25  -1.80  -2.44
 -1.95  -2.56  -2.05  -1.64  -1.31  -1.05  -0.84   0.33   0.26   0.21
  0.17   0.14   0.11   0.09   0.07   0.06  -0.95  -0.76  -0.61  -0.49
 -0.39  -0.31  -0.25  -0.20  -0.16  -0.13  -1.10  -0.88  -0.71  -0.56

(3) Model series $(\delta_t^{(3)})$. First 50 elements (read row by row).

  1.00   1.20  -0.41  -0.86   1.09  -0.22  -0.76  -1.01  -0.71   0.51
  0.56  -1.22  -0.61   1.67   1.73  -0.74  -1.27   0.23  -0.13  -0.17
 -0.95  -0.08  -0.40  -0.03   1.25   0.27   0.24  -1.13  -0.38   0.66
 -0.62  -1.55   0.09   1.03   0.15  -0.64  -0.22  -0.63   1.02   1.61
  0.66  -1.92   0.19   2.28  -0.67  -0.62   0.31   0.47  -0.11  -1.32

ε. On the periodic processes. In this article, a stationary process
will be constructed which belongs to the class of singular processes
as introduced in section 14. A distribution function $F(u)$ and an
integer $h$ being arbitrarily given, there will be constructed a stationary
process, say $\{\pi(t)\}$, with the following properties: (a) the
process will be singular by means of the relation

(106)  $\pi(t) - \pi(t-h) = 0;$

(b) the process has $F(u)$ for principal distribution function. According
to the construction, any sample series, say $(\ldots, \pi_{t-1}, \pi_t, \pi_{t+1}, \ldots)$, will be
strictly periodic, and with period $h$. The process will, accordingly,
be termed a periodic process.

The simple construction device reads as follows. Let $[\pi^{(1)}, \ldots, \pi^{(h)}]$
represent an $h$-dimensional aleatory variable such that (A) the distribution
function of $[\pi^{(1)}, \ldots, \pi^{(h)}]$, say $F(u_1, \ldots, u_h)$, is symmetrical in respect of
the variables $u_i$, (B) all the variables $\pi^{(i)}$ have $F(u)$ for distribution
function. Now, let a sequence of multi-dimensional aleatory variables
be defined by

$[\pi^{(1)}, \ldots, \pi^{(h)}],\quad [\pi^{(1)}, \ldots, \pi^{(h)}, \pi^{(1)}],\quad [\pi^{(1)}, \ldots, \pi^{(h)}, \pi^{(1)}, \pi^{(2)}],\ \ldots,$
$[\pi^{(1)}, \ldots, \pi^{(h)}, \pi^{(1)}, \ldots, \pi^{(h)}],\quad [\pi^{(1)}, \ldots, \pi^{(h)}, \pi^{(1)}, \ldots, \pi^{(h)}, \pi^{(1)}],\ \ldots$

It is evident that these variables may be taken for the variables

$\pi(t, t-1, \ldots, t-h+1),\quad \pi(t, \ldots, t-h+1, t-h),\quad \pi(t, \ldots, t-h+1, t-h, t-h-1),\ \ldots,$
$\pi(t, \ldots, t-h+1, t-h, \ldots, t-2h+1),\quad \pi(t, \ldots, t-h+1, t-h, \ldots, t-2h+1, t-2h),\ \ldots$

connected with a stationary process $\{\pi(t)\}$. It needs no comment
that this process has the advanced properties (a) and (b).

For exemplifying conditioned probability distributions connected
with the periodic process constructed above, let $C$ stand for a
condition implying $\pi(t) = \pi_t$, and let $n$ represent an integer. Then
$P_C[\pi(t + n \cdot h) \subset S] = 1$ if $\pi_t$ belongs to the set $S$, otherwise $P_C = 0$.
Further, $E_C[\pi(t + n \cdot h)] = \pi_t$.

For the construction of a model series, say $\pi_1, \pi_2, \ldots$, illustrating
the periodic process defined above, it will be sufficient to form a
model random sample $\pi_1, \pi_2, \ldots, \pi_h$ on the basis of the distribution
function $F(u)$, and then take $\pi_{h+1} = \pi_1$, $\pi_{h+2} = \pi_2$, etc. Still
simpler, a model series of this type has been obtained by letting a
coin-throw decide whether $\pi_1$ should equal 1 or $-1$, and by taking
$\pi_{2i} = -\pi_{2i+1} = -\pi_1$. The resulting series reads

(107)  1 -1 1 -1 1 -1 1 -1 1 -1 1 ...

This series will be referred to as the π-series.
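
The construction of a periodic model series amounts to repeating a sample of length $h$. A minimal sketch, with an illustrative function name and the starting sample $(1, -1)$ fixed by hand rather than by a coin-throw:

```python
import itertools


def periodic_series(sample, length):
    """Repeat a sample of h values cyclically to obtain a series of the given length."""
    return list(itertools.islice(itertools.cycle(sample), length))


# Period h = 2 with the sample (1, -1); the first 11 elements agree with (107).
pi_series = periodic_series([1, -1], 11)
print(pi_series)
```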

Considering the difference equation (14) corresponding to the
relation of singularity (106) of the periodic process constructed,
this equation has for general solution a sum of harmonics, viz.
the expression (15), a circumstance in full agreement with theorem 2.
This aspect corresponds to interpreting a sample series $\pi_t$ by means
of Fourier analysis as a composed harmonic (cf. section 5).

Having so far shown that the class of singular stationary processes
introduced in section 14 is not vacuous, the next question is
if there exist singular processes which satisfy relations (77) of a
more general type than (106). As a matter of fact, any particular
relation generalizing (106) will impose certain conditions upon the
distribution functions $\{F\}$ of the singular process. Leaving open
the question of which special distribution functions may present
themselves in case of a special singularity of type (77), we advance
that the normal process (see section 16) will be found to admit
any relation (83) as a singular case. After this reference to a process
of superposed harmonics, only one of the schemes surveyed in
Chapter I, viz. the scheme of hidden periodicities, remains to be
interpreted as a stationary process.

ϑ. The process of hidden periodicities. Let $\{\xi^{(1)}(t)\}, \ldots, \{\xi^{(k)}(t)\}$
represent independent stationary processes. According to section
13, the sum $\{\xi(t)\} = \{\xi^{(1)}(t)\} + \cdots + \{\xi^{(k)}(t)\}$ will constitute a stationary
process. If at least one of the processes $\{\xi^{(i)}(t)\}$ is a periodic process,
or a process of superposed harmonics, $\{\xi(t)\}$ will be called a
process of hidden periodicities. In particular, letting $k = 2$, and
taking for $\{\xi^{(1)}(t)\}$ a process of superposed harmonics, and for
$\{\xi^{(2)}(t)\}$ a purely random process, we get the simple scheme of
hidden periodicities dealt with in section 8.

A model series for the process of hidden periodicities may be
obtained from independently constructed model series for the purely
random process and the periodic process. Taking one model series
of each type, the series obtained by summing corresponding elements
will form a model series for the process of hidden periodicities.
The table below gives the first elements in two model
series for the processes $\{\vartheta^{(1)}(t)\}$ and $\{\vartheta^{(2)}(t)\}$ defined by

$\{\vartheta^{(1)}(t)\} = \{\alpha^{(1)}(t)\} + \{\pi(t)\}, \qquad \{\vartheta^{(2)}(t)\} = \{\alpha^{(2)}(t)\} + \{\pi(t)\},$

the α- and π-processes being defined in corresponding articles of
the present section. The two ϑ-series consist of 1000 elements
each. The construction may be followed in detail by means of
table 1 and the series (107) (cf. formula (39)).
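
The summation of corresponding elements may be sketched as follows, using the first 20 elements of the first α-series as read from table 1 and the π-series (107) continued with period 2. Under these assumptions the sums agree with the opening values of table 4 (1).

```python
# First 20 elements of the first alpha-series, as read from table 1.
alpha_1 = [0, -1, 0, 0, 0, 0, 0, 0, 0, -1, -1, 0, -1, 0, -1, 0, 0, -1, 0, -1]

# The pi-series (107): 1, -1, 1, -1, ... with period 2.
pi_series = [1 if t % 2 == 0 else -1 for t in range(len(alpha_1))]

# Hidden periodicities: sum of the purely random and the periodic series.
theta_1 = [a + p for a, p in zip(alpha_1, pi_series)]
print(theta_1)
```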


Table 4. (1) Model series $(\vartheta_t^{(1)})$; first 100 elements (read row by row).

 1 -2  1 -1  1 -1  1 -1  1 -2  0 -1  0 -1  0 -1  1 -2  1 -2
 1 -2  1 -1  1 -1  1  0  1 -1  1 -1  1 -1  1 -1  0 -1  1 -1
 1 -1  1 -1  1 -1  0 -1  1 -1  2 -1  1 -1  1 -1  1 -1  2  0
 1 -1  0 -1  1 -1  1  0  1 -1  1 -1  1 -1  2 -1  1 -1  1 -1
 1 -1  1 -1  1  0  1 -2  1 -2  1 -1  1 -1  0 -1  2 -1  1  0

(2) Model series $(\vartheta_t^{(2)})$; first 100 elements (read row by row).

 2  0  1 -1  2 -2  1 -2  0 -1  1 -2  1  0  2 -1  1 -1  0 -1
 0 -1  0 -1  2 -1  2 -2  1 -1  0 -2  1 -1  1 -1  1 -2  2  0
 2 -2  2  0  0  0  1 -1  1 -2  0 -2  2 -1  2  0  0 -2  2  0
 2  0  2  0  1  0  1  0  2 -1  2  0  1  0  2  0  0 -2  0 -2
 1 -2  0 -1  1 -1  2  0  2 -2  1 -1  2 -1  0 -1  0 -2  2 -2

In spite of the fact that the process of hidden periodicities 
is not the only type of sum of independent processes dealt with in 
the sequel, it seems superfluous to go into further details in this 
introductory section. In conclusion, the following simple relations 
involving conditioned variables and expectations will be mentioned 

i3c' {t n • li) = Ttt + w * A). 

nm = 7tt^ 

Here $\{\pi(t)\}$ and $\{\xi(t)\}$ represent independent stationary processes
with sum $\{\vartheta(t)\}$. The process $\{\pi(t)\}$ is assumed to be periodic with
period $h$, while $n$ is an arbitrary integer, and $C$ stands for the
condition $(C) = (\pi(t) = \pi_t)$.


16. On the normal stationary process.

With reference to A. Kolmogoroff, A. Khintchine, in a paper
(1934) recurred to in the next section, touches upon a continuous
stationary process constructed by means of normal distribution
functions. Mutatis mutandis, the same construction device will
supply a discrete stationary process. In the sequel, this process
will be termed the normal process. As a basis for illustrating the
general stationary process, the normal process will prove very
useful. In fact, although the formal developments connected
with the general normal process are of a simple structure, the
normal process will, by proper specializations, be able to illustrate
any type of stationary process mentioned in the previous section,
and besides, as already advanced, the singular processes
satisfying relations of type (83).

Before going into details concerning the normal stationary process,
it will be convenient to introduce the concept of a general normal
distribution in an enumerable set of variables.

Let an infinite quadratic form with real coefficients be given by

(108)  $Q(X_1, X_2, \ldots) = \sum_{p=1}^{\infty} \sum_{q=1}^{\infty} \mu_{pq}\, X_p X_q.$

In the following analysis, the variables $X_i$ may take on any real
values. Under such circumstances, the form $Q$ will not always be
convergent. If divergent, the form must be interpreted symbolically,
viz. as the comprehension of all finite forms of type

$Q_n(X_1, \ldots, X_n) = Q(X_1, \ldots, X_n, 0, 0, \ldots).$

Thus prepared, let (A) $\mu_{pq} = \mu_{qp}$, and (B) any determinant $\Delta(n)$
of type

$\Delta(n) = \begin{vmatrix} \mu_{11} & \mu_{12} & \cdots & \mu_{1n} \\ \mu_{21} & \mu_{22} & \cdots & \mu_{2n} \\ \vdots & & & \vdots \\ \mu_{n1} & \mu_{n2} & \cdots & \mu_{nn} \end{vmatrix}$

be non-negative. Then, taking for $(m) = (m_1, m_2, \ldots)$ an arbitrary
real sequence, a function $f(X_1, X_2, \ldots)$ in an enumerable set of
variables $(X) = (X_1, X_2, \ldots)$ will be defined by

(109)  $f(X_1, X_2, \ldots) = e^{\, i \sum_{p=1}^{\infty} m_p X_p \, - \, \frac{1}{2} Q(X_1, X_2, \ldots)}.$

As often as $Q$ is divergent, this function must be interpreted
symbolically in the same way as $Q$. Now, the function $f_n(X_1, \ldots, X_n)$
defined by

$f_n(X_1, \ldots, X_n) = f(X_1, \ldots, X_n, 0, 0, \ldots) = e^{\, i \sum_{p=1}^{n} m_p X_p \, - \, \frac{1}{2} \sum_{p=1}^{n} \sum_{q=1}^{n} \mu_{pq} X_p X_q}$

is the characteristic function of a certain normal, $n$-dimensional
aleatory variable, say $[\xi_1, \ldots, \xi_n]$ (see e.g. H. Cramér (1937) p. 109).
We have $E[\xi_i] = m_i$, and $E[\xi_i \xi_k] - m_i m_k = \mu_{ik}$. The form
$Q_n$ will thus be of the type (76).

In case $\Delta(n) \ne 0$, the variable $[\xi_1, \ldots, \xi_n]$ will possess an absolutely
continuous distribution, and the density function, say
$\varphi_n(u_1, \ldots, u_n)$, will be given by

$\varphi_n(u_1, \ldots, u_n) = \dfrac{1}{(2\pi)^{n/2} \sqrt{\Delta(n)}}\; e^{-\frac{1}{2} q_n(u_1, \ldots, u_n)}.$

Here $q_n$ is defined by

$q_n(u_1, \ldots, u_n) = \sum_{p=1}^{n} \sum_{q=1}^{n} \dfrac{\Delta_{pq}(n)}{\Delta(n)}\, (u_p - m_p)(u_q - m_q),$

where $\Delta_{pq}(n)$ stands for the minor of $\Delta(n)$ belonging to the element
$\mu_{pq}$. Considering, on the other hand, the case $\Delta(n) = 0$, let it be
assumed that $\Delta(n)$ is of rank $h < n$. Writing the relations of
singularity in the form (69) after a suitable arrangement of the
variables $\xi_i$, it follows from the previous analysis that the variable
defined by (72) will be composed of $h$, and only $h$, non-constant
variables. On the other hand, for $i > h$ the variables
reduce to the constants given by (75). Expressing the
characteristic function of $[\xi_1, \ldots, \xi_n]$ in the variables $Z_i$, this will
reduce to

( 110 ) fliZ,, .,,Zn)=e 

Next, let us consider the infinite sequence $\xi = [\xi_1, \xi_2, \ldots]$. By
construction, every finite sub-group $[\xi_{i_1}, \ldots, \xi_{i_n}]$ will possess a well-defined
probability distribution, and, further, satisfy all consistency
relations of type (53)–(54). Thus $\xi$ will constitute a random variable
in an infinite number of dimensions. A variable $\xi$ of this type will
be termed normal. The function $f$ will be called the characteristic
function of $\xi$. This term is justified by the evident fact that the
distribution of $\xi$ is uniquely determined by $f$, and vice versa.

A necessary and sufficient condition that a normal distribution
as defined by (109) may be taken to define a variable $\xi(t, t-1, \ldots)$
connected with a stationary process $\{\xi(t)\}$ is that (a) $m_p$ reduces to
a constant, say $m$, and (b) $\mu_{pq}$ is a function of $p - q$, say $\mu_{p-q}$.
In fact, taking $X_t, X_{t-1}, X_{t-2}, \ldots$ for variables in the characteristic
function (109), the coefficients of $X_{t-p}$ and of $X_{t-p} \cdot X_{t-q}$ will be
independent of $t$ when, and only when, the conditions (a) and (b)
are satisfied simultaneously.

According to the condition (A) attached to (108), we have
$\mu_{p-q} = \mu_{q-p} = \mu_{|p-q|}$. Further, disregarding the case of empty
determinants $\Delta(n)$, the second condition, $\Delta(n) \ge 0$, is seen to imply
$\mu_0 > 0$. Thus a set of real numbers $r_k = r_{-k}$ will be well-defined by
putting

$r_k = \mu_k / \mu_0.$

In terms of these $r_k$ the conditions $\Delta(n) \ge 0$ will reduce to the
inequalities (81).

According to the above, the general formula for the characteristic
function of the variable $\xi(t, t-1, \ldots)$ connected with a normal
stationary process $\{\xi(t)\}$ is given by

(111)  $f(X_t, X_{t-1}, \ldots) = e^{\, i m \sum_{p=0}^{\infty} X_{t-p} \, - \, \frac{\sigma^2}{2} \sum_{p=0}^{\infty} \sum_{q=0}^{\infty} r_{p-q}\, X_{t-p} X_{t-q}},$

where (A) $\sigma > 0$, and $m$ and $r_k$ are real, and (B) the coefficients $r_k$ satisfy
the inequalities (81), viz. $\Delta(r, n) \ge 0$.

It is seen that the normal stationary process defined by (111)
has for principal distribution function the normal distribution
function

$\Phi(u) = \dfrac{1}{\sigma \sqrt{2\pi}} \int_{-\infty}^{u} e^{-\frac{(x - m)^2}{2\sigma^2}}\, dx,$

and that the coefficients $r_k$ appearing in (111) are nothing else than
the autocorrelation coefficients of the process.

Thanks to theorem 2, the existence of a singular normal process
satisfying (83) will be proved if the autocorrelation coefficients $r_k$ of
any composed harmonic satisfying (30) are such that

(112)  $\sum_{p=0}^{n} \sum_{q=0}^{n} r_{p-q}\, X_{t-p} X_{t-q} \ge 0$

for any $n$, and for any real sequence $(X) = (X_t, X_{t-1}, \ldots)$.
For then, this form, and the quantity $m$ obtained from (30), satisfy
the conditions for introduction in the general formula (111) of the
characteristic function of the variables $\xi(t, t-1, \ldots)$ defining the
normal process, while the dispersion $\sigma > 0$ may be chosen freely.

The remaining proof involves no difficulty. Denoting by $x(t)$ an
arbitrary composed harmonic satisfying (30), let a set of integers
be given by $0 \le i_1 < i_2 < \cdots < i_n$. Employing an argument used by
A. Khintchine (1934), the obvious relations

(113)  $0 \le \dfrac{1}{N} \sum_{s=1}^{N} \left[ X_{t-i_1} (x(t - i_1 + s) - m) + X_{t-i_2} (x(t - i_2 + s) - m) + \cdots + X_{t-i_n} (x(t - i_n + s) - m) \right]^2$
$\qquad = \sum_{p=1}^{n} \sum_{q=1}^{n} X_{t-i_p} X_{t-i_q} \cdot \dfrac{1}{N} \sum_{s=1}^{N} (x(t - i_p + s) - m)(x(t - i_q + s) - m)$

will define a non-negative definite quadratic form in the variables
$X_{t-i_p}$. After this observation, let $N \to \infty$. Then, disregarding a
constant factor, the coefficient of $X_{t-i_p} \cdot X_{t-i_q}$ in the quadratic form
will, by definition, tend to the autocorrelation coefficient $r_{i_p - i_q}$ which
belongs to the composed harmonic considered (cf. (12)). Since the
form will remain $\ge 0$ also in the limit, and since the sequence
$i_p$ is arbitrary, the limit inequality implies (112).
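
The argument may be illustrated numerically. The sketch below is an illustration only: it takes a hypothetical harmonic of period 4 with mean $m = 0$ and an arbitrary phase, forms the time averages of lagged products over $N = 400$ terms, and evaluates the resulting quadratic form for a few sets of values $X$. The function names are illustrative.

```python
import math


def time_average_autocorrelation(x, k, n_terms):
    """Sum of x(s)*x(s+k) over s = 1..N divided by the sum of x(s)^2 (mean assumed zero)."""
    num = sum(x(s) * x(s + k) for s in range(1, n_terms + 1))
    den = sum(x(s) ** 2 for s in range(1, n_terms + 1))
    return num / den


def harmonic(s):
    # Hypothetical harmonic with mean 0, period 4 and arbitrary phase 0.3.
    return math.cos(math.pi * s / 2 + 0.3)


def quadratic_form(r, xs):
    """Sum over p, q of r_{|p-q|} * X_p * X_q."""
    return sum(r[abs(p - q)] * xs[p] * xs[q] for p in range(len(xs)) for q in range(len(xs)))


r = [time_average_autocorrelation(harmonic, k, 400) for k in range(3)]
print([round(v, 6) for v in r])        # close to 1, 0, -1
print(quadratic_form(r, [1, 2, -1]))   # positive
print(quadratic_form(r, [1, 0, 1]))    # close to zero: the singular direction
```

The form vanishes for $X = (1, 0, 1)$, which reflects the difference relation $x(t) + x(t-2) = 0$ satisfied by a harmonic of period 4.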

Some circumstances connected with the singular processes merit
particular attention. In the first place, assuming $L[\xi(t) - m] = 0$
as given by (77) to be the relation of singularity of the lowest
possible order, the distribution of $\xi(t-1, \ldots, t-h)$ will be non-singular.
In other words, if in a sample series

(i^o “h ^0 *t" ^ — 2, . . ., ^0 — 1} • . ^ 

the values b«o ~2 are known, neither nor •• 

will be uniquely determined. On the other hand, if the sample 
series section §io-i is known, then and, recurrently 
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Sfo+i, • will, with probability one, be uniquely determined by 

means of ~ m] == 0. Thus, ^t,+t msbj be regarded as a solution 

of the difference equation (32) subject to the initial conditions 
--j According to the theory of difference equations (cf. 

section 6), the periods pk of the individual harmonics in ^to+t will 
be determined by the coefficients m in (77), and therefore be the 
same in all sample series connected with the process considered. 
On the other hand, the amplitudes Cjc and the phases g)k will be 
determined by the initial values. As mentioned above, these contain 
a random element, so the amplitudes and the phases of the indi- 
vidual harmonics constituting £to+t will vary from one sample series 
(. . • • *) another. Of course, any expectation connected 

with the varying phases and amplitudes, e. g. -E[(7|], forms a charac- 
teristic of the process, and may, considering e. g. a normal process 
defined by (111), be expressed in terms of m, a and n (cf, p. 73). 
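The division of roles described above (periods fixed by the coefficients of the difference equation, amplitudes and phases fixed by the initial values) can be illustrated on the simplest case. The sketch assumes a single harmonic with mean zero, so that the relation of singularity is taken to be $x_t = 2 \cos \lambda \cdot x_{t-1} - x_{t-2}$; this particular equation and the function names are illustrative assumptions, not formulae quoted from (32) or (77).

```python
import math

def continue_series(x0, x1, lam, steps):
    """Continue x_t = 2*cos(lam)*x_{t-1} - x_{t-2} from two initial values."""
    xs = [x0, x1]
    for _ in range(steps - 2):
        xs.append(2.0 * math.cos(lam) * xs[-1] - xs[-2])
    return xs

def amplitude_and_phase(x0, x1, lam):
    """Solve x_t = C*cos(lam*t + phi) for C and phi from x_0 and x_1."""
    a = x0                                           # coefficient of cos(lam*t)
    b = (x1 - x0 * math.cos(lam)) / math.sin(lam)    # coefficient of sin(lam*t)
    return math.hypot(a, b), math.atan2(-b, a)

lam = 2 * math.pi / 12                  # period 12, fixed by the coefficient 2*cos(lam)
series_a = continue_series(1.0, 0.2, lam, 25)
series_b = continue_series(-0.3, 2.0, lam, 25)
amp_a, phase_a = amplitude_and_phase(1.0, 0.2, lam)
amp_b, phase_b = amplitude_and_phase(-0.3, 2.0, lam)
```

Both series repeat after 12 steps, while their amplitudes and phases differ with the initial values chosen.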

The above remarks apply, of course, to every singular process,
and thus both to the periodic processes and to the singular normal
processes. If the distribution of $\xi(t-1, \ldots, t-h)$ is absolutely
continuous, which is always the case in the normal process, a more
precise conclusion may be arrived at. In fact, let it in such case
be assumed that not all of the individual harmonics in (17) would
be present in $\xi_{t_0+t}$, i. e. that at least one harmonic would have a
vanishing amplitude $C_k$. Then $\xi_{t_0+t}$ must satisfy a difference equation
of order $h - 1$ having a general solution satisfying also (32). Since
there are only $h$ such equations at most, the sample series sections
$\xi_{t_0-1}, \ldots, \xi_{t_0-h}$ having the property assumed will form a set of
Borel measure zero in the space of $\xi(t-1, \ldots, t-h)$. Keeping in
mind the absolute continuity assumed, it follows that with probability
one all individual harmonics really will present themselves
when writing a sample series $\xi_{t_0+t}$ in the form (17).

The investigations of E. Slutsky (see (1927) and, e. g., (1937))
and V. Romanovsky ((1932), (1933)) concerning the »sinusoidal limit
law» fall under the theory of the stationary process, and present
some parallelism with the previous analysis of the concept of
singular process as introduced in section 14. Translating into the
terminology of the present study, these authors investigate certain
sequences of stationary processes, say $\{\xi^{(1)}(t)\}, \{\xi^{(2)}(t)\}, \ldots$. Denoting
by $r_k^{(p)}$ the autocorrelation coefficients of the process $\{\xi^{(p)}(t)\}$, and
by $x(t)$ a function (17) satisfying a linear relation $L[x(t) - m] = 0$,
the conditions imposed on the sequence imply that

…

as $p \to \infty$ (see V. Romanovsky (1932), Theorem D). Representing
by $(\xi^{(p)}_{t+1}, \ldots, \xi^{(p)}_{t+n})$ a section in a sample series of $\{\xi^{(p)}(t)\}$, and holding
$n$ fixed, the sinusoidal limit theorem asserts that, for sufficiently large
values of $p$, the section considered will, with a probability as close
to one as desired, approximate a function of type $x(t)$ with any
prescribed accuracy.

A few reflections on the previous analysis will verify this theorem.
Writing $m_p = E[\xi^{(p)}(t)]$, let an auxiliary set of processes
$\{\zeta^{(1)}(t)\}, \{\zeta^{(2)}(t)\}, \ldots$ be defined by $\{\zeta^{(p)}(t)\} = L[\{\xi^{(p)}(t)\} - m_p]$.
It follows from the conditions imposed on $r_k^{(p)}$ that
the variables $\zeta^{(p)}(t + i)$ will tend in probability to zero as $p \to \infty$.
It remains to prove that the composite variable $[\zeta^{(p)}(t + 1), \ldots, \zeta^{(p)}(t + n)]$
will, for any fixed $n$, tend in probability to $(0, 0, \ldots, 0)$
as $p \to \infty$. This, however, follows at once from the Boole inequality (64).

By examples of the type $\{\xi^{(p)}(t)\} = \sum_{i}^{p} a_{pi} \cdot \{\alpha(t - i)\}$, E. Slutsky
and V. Romanovsky show that their theorems are not empty.
While the added processes $\alpha$ used by Slutsky are all of the purely
random type, Romanovsky gives other examples as well. The
recent paper of E. Slutsky (1937) already referred to contains
references to certain related investigations, and illustrates in full
detail the behaviour of model series of the processes $\{\xi^{(p)}(t)\}$ considered.
In full agreement with the sinusoidal limit theorem,
sections of moderate length in a model series approximate composed
harmonics (17) of the proper type. The periods of the individual
harmonics are the same in different sections, while the amplitudes
and phases vary. These features are seen to be analogous to the
properties of the singular processes proved above in connexion with
the singular normal process. This parallelism is not accidental.
In fact, an analysis of the sequences studied by E. Slutsky and
V. Romanovsky will show that these are convergent, and that the
limit processes are singular. In section 25 is given, i. a., a general
device for the construction of sequences $\{\xi^{(p)}(t)\}$ ruled by the
sinusoidal limit theorem.


17. The autocorrelation coefficients as Fourier constants.

In a paper already referred to, A. Khintchine (1934) studies
what in a continuous statistical process with finite dispersion
corresponds to the autocorrelation coefficients in a discrete process,
viz. the function defined for any real $u$ by

$R(u) = E[(\xi(t) - E\xi(t)) \cdot (\xi(t + u) - E\xi(t + u))] \,/\, D^2(\xi(t))$,

where $\{\xi(t)\}$ represents the continuous stationary process considered.

Terming $R(u)$ the correlation function of the process, Khintchine
gives, i. a., a necessary and sufficient condition that a function
$R(u)$ be the correlation function of a continuous stationary process.
Slightly modifying the result, the condition is that there exists a
distribution function, say $V(x)$, such that $V(0) = 0$, and

(114)  $R(u) = \int_{0}^{\infty} \cos ux \cdot dV(x)$.

The inversion formula (see e. g. H. Cramér (1935), Theorem 9)

$V(x) = \frac{2}{\pi} \int_{0}^{\infty} R(u) \, \frac{\sin xu}{u} \, du$

shows that the function $V(x)$ is uniquely determined by $R(u)$.

We shall first give a corresponding theorem on the discrete stationary
process. It should be observed that the same theorem holds
for the generalized process (cf. section 11).

Theorem 5. Let $r_k$ ($k = 0, \pm 1, \pm 2, \ldots$) be an arbitrary sequence
of constants. A necessary and sufficient condition that there exists a
discrete stationary process with the $r_k$'s for autocorrelation coefficients
is that the $r_k$'s are the Fourier constants of a non-decreasing function,
say $W(x)$, such that $W(0) = 0$; $W(\pi) = \pi$,

(115)  $r_k = \frac{1}{\pi} \int_{0}^{\pi} \cos kx \cdot dW(x)$.

Before passing to the proof we give the following inversion formula
involving a converging sum (see e. g. F. Hausdorff (1923)
p. 245):

(116)  $W(x) = x + 2 \sum_{k=1}^{\infty} \frac{r_k}{k} \sin kx$.

With suitable agreements as to points of discontinuity, the function
$W(x)$ thus will be uniquely determined by the autocorrelation
coefficients. In the sequel, $W(x)$ will be called the generating function
of the autocorrelation coefficients $r_k$.

The proof will use some facts concerning definite quadratic forms, 
facts parallel to the properties of definite functions used by A. 
Khintchine in the continuous case. In other respects the proofs 
are coincident. 

For a verification of the necessity of the condition given, let
$\{\xi(t)\}$ be an arbitrary discrete stationary process with finite dispersion,
say $\sigma$. Put $E[\xi(t)] = m$, let $r_k$ represent the autocorrelation
coefficients of the process $\{\xi(t)\}$, and consider the quadratic form

(117)  $\sum_{p=0}^{n} \sum_{q=0}^{n} r_{|p-q|} X_{t-p} X_{t-q}$;  $n = 0, 1, 2, \ldots$

For any real sequence $X_t, \ldots, X_{t-n}$ we have (cf. (113))

$0 \le \frac{1}{\sigma^2} E\Big[ \sum_{p=0}^{n} X_{t-p} (\xi(t - p) - m) \Big]^2 = \sum_{p=0}^{n} \sum_{q=0}^{n} r_{|p-q|} X_{t-p} X_{t-q}$.

This relation implies that the forms (117) are non-negative definite.
Next, according to a theorem of G. Herglotz¹ (1911), this statement
is equivalent to saying that the following system of trigonometrical
moments,

(118)  $\frac{1}{2\pi} \int_{0}^{2\pi} \cos kx \cdot dW(x) = r_k$;  $\frac{1}{2\pi} \int_{0}^{2\pi} \sin kx \cdot dW(x) = 0$,

has a non-decreasing solution $W(x)$ with $W(0) = 0$.

The inversion of (118) gives exactly (116). The required relation
(115) follows directly from (116), and since (116) further gives

$W(2\pi - x) = 2\pi - x - 2 \sum_{k=1}^{\infty} \frac{r_k}{k} \sin kx$,

we obtain $W(x) = 2\pi - W(2\pi - x)$, and, finally, $W(\pi) = \pi$.

On the other hand, if $r_k$ is given by (115), the Herglotz theorem
asserts that the forms (117) are non-negative. Thus, the relation
(112) holds, and there exists a normal process with the given $r_k$
values for autocorrelation coefficients.

¹ Prof. T. Carleman has kindly informed me that the formulae (118) may be
obtained directly from the Hilbert representation of a definite quadratic form of
general type.

The theorem proved above permits of some general conclusions
concerning the autocorrelation coefficients of a discrete stationary
process. Considering the coefficients $r_k$ for large $k$-values, their
behaviour will depend on the continuity structure of the generating
function $W(x)$. In order to study the situation in some
detail, let

$W(x) = a^{(1)} \cdot W^{(1)}(x) + a^{(2)} \cdot W^{(2)}(x) + a^{(3)} \cdot W^{(3)}(x)$

stand for the well-known (cf. e. g. H. Cramér (1928), p. 59) representation
of $W(x)$ as the sum of three uniquely determined, non-decreasing
functions such that $W^{(1)}(0) = W^{(2)}(0) = W^{(3)}(0) = 0$,
$W^{(1)}(\pi) = W^{(2)}(\pi) = W^{(3)}(\pi) = \pi$, $a^{(1)} + a^{(2)} + a^{(3)} = 1$, $a^{(i)} \ge 0$, and

1) $W^{(1)}(x)$ is absolutely continuous, $W^{(1)}(x) = \int_{0}^{x} \frac{dW^{(1)}(y)}{dy} \, dy$;

2) $W^{(2)}(x)$, the saltus function, is equal to the sum of saltuses
of $W(x)$ at all the points of discontinuity which are less than or
equal to $x$; writing $\lambda_1, \lambda_2, \ldots$ for the saltus points of $W(x)$, and
$c_i^2 \cdot \pi / 2$ for the corresponding saltuses;

3) $W^{(3)}(x)$, the singular function, is a continuous function
which has almost everywhere a derivative equal to zero.

Thus prepared, we put

$r_k^{(i)} = \frac{1}{\pi} \int_{0}^{\pi} \cos kx \cdot dW^{(i)}(x)$,  $i = 1, 2, 3$,

and obtain

(119)  $r_k = a^{(1)} \cdot r_k^{(1)} + a^{(2)} \cdot r_k^{(2)} + a^{(3)} \cdot r_k^{(3)}$.

The components thus uniquely determined by the $r_k$-sequence
via its generating function (cf. (116)) are of entirely different character.

As to $r_k^{(1)}$, we have (see e. g. H. C. Carslaw (1930), p. 271)

$r_k^{(1)} = \frac{1}{\pi} \int_{0}^{\pi} \frac{dW^{(1)}(x)}{dx} \cos kx \, dx \to 0$  as $k \to \infty$.

The component $r_k^{(2)}$ is given by

(120)  $r_k^{(2)} = \frac{1}{2} \sum_{i} c_i^2 \cdot \cos \lambda_i k$.

It is seen that $r_k^{(2)}$ is an almost periodic function. Again referring
to (19), we conclude that arbitrarily large $k$-values exist for
which $r_k^{(2)}$ approximates

$r_0^{(2)} = \frac{1}{\pi} W^{(2)}(\pi) = \frac{1}{2} \sum_{i} c_i^2$.

The singular component $r_k^{(3)}$ permits of no unconditioned statement
as to its behaviour for large $k$-values.

In order to arrive at a criterion of the structure of $W(x)$, let it
be assumed that $\sum_{k=1}^{\infty} |r_k|$ is convergent. It follows from (116) that
in such a case $W(x)$ for all $x$ has a derivative $W'(x)$, that this will
be obtained by summing the derivatives of the terms in the right
member of (116), and that $W'(x)$ will be bounded. We thus obtain
the following corollary to theorem 5.

Corollary. Let $\{\xi(t)\}$ be a stationary process with autocorrelation
coefficients $r_k$ such that $\sum |r_k|$ is convergent. Then $W(x)$ will be
absolutely continuous. The derivative $W'(x)$ is bounded in modulus, and given by

(121)  $W'(x) = \sum_{k=-\infty}^{\infty} r_k \cos kx$,  $0 \le x \le \pi$.
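The relation between (115) and (121) can be checked numerically on a simple absolutely summable sequence. The sketch below is illustrative only: the choice $r_k = \rho^{|k|}$ with $\rho = 0.5$, the truncation of the series at $K$ terms, and the function names are assumptions made for the example. It builds $W'(x)$ from the truncated series (121) and recovers $r_k$ again by evaluating the integral (115) with the midpoint rule.

```python
import math

def w_prime(x, r, K):
    """(121), truncated: W'(x) = 1 + 2 * sum_{k=1}^{K} r_k cos kx."""
    return 1.0 + 2.0 * sum(r(k) * math.cos(k * x) for k in range(1, K + 1))

def fourier_constant(k, r, K, steps=2000):
    """(115) with dW(x) = W'(x) dx, integrated over (0, pi) by the midpoint rule."""
    h = math.pi / steps
    total = 0.0
    for j in range(steps):
        x = (j + 0.5) * h
        total += math.cos(k * x) * w_prime(x, r, K)
    return total * h / math.pi

rho = 0.5
r = lambda k: rho ** k          # absolutely summable autocorrelation coefficients
recovered_r3 = fourier_constant(3, r, K=60)
density_at_pi = w_prime(math.pi, r, K=60)
```

For this sequence the series (121) sums to $(1 - \rho^2) / (1 - 2\rho \cos x + \rho^2)$, a bounded positive function, in line with the corollary.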

As a first application of the above analysis, we shall touch upon
some problems concerning the relation between continuous and
discrete stationary processes. The question to be put corresponds
to a problem dealt with by G. Elfving (1937) in a study on Markoff
chains.

A continuous stationary process, say $\{\xi^{(c)}(t)\}$, gives a hypothetical
scheme for the probability relations in any time points by means
of a set of distribution functions $F(t_1, \ldots, t_n; u_1, \ldots, u_n)$ satisfying
(53)-(55) and referring to quite arbitrary time points $t_1, \ldots, t_n$.
Among these distribution functions, let those referring to integral
time points be represented by $\{F^{(d)}\}$. It is plain that the set
thus obtained will define a discrete stationary process, say $\{\xi^{(d)}(t)\}$.
The situation may be described by saying that the hypothesis
$\{\xi^{(c)}(t)\}$ is consistent with the hypothesis $\{\xi^{(d)}(t)\}$.
Marking the symbols referring to consistent processes $\{\xi^{(c)}(t)\}$ and
$\{\xi^{(d)}(t)\}$ by (c) and (d) respectively, let $\{\xi^{(c)}(t)\}$ have a finite dispersion.
Then, evidently, we have for any integral $k$ (cf. (114))

$r_k^{(d)} = R^{(c)}(k)$.

Further, for any $x$ in the interval $(0, \pi)$, we have

(122)  $W(x) = \pi \cdot \sum_{n=0}^{\infty} [V(n \cdot 2\pi + x) - V(n \cdot 2\pi - x)]$.

In fact, inserting the right member of (122) in (115), and paying
regard to (114), we obtain by elementary transformations for integral
$k$-values

$r_k^{(d)} = \int_{0}^{\pi} \cos kx \cdot \sum_{n=0}^{\infty} d[V(n \cdot 2\pi + x) - V(n \cdot 2\pi - x)] = R^{(c)}(k)$.

On the other hand, let a discrete process $\{\xi^{(d)}(t)\}$ be defined by
a set $\{F^{(d)}\}$ of distribution functions referring to integral time
points. Our question is whether there exists an »interpolating» continuous
process $\{\xi^{(c)}(t)\}$ defined by a set $\{F^{(c)}\}$ where the distribution
functions referring to integral time points are identical with the
given set $\{F^{(d)}\}$. Studying in the first place the autocorrelation
coefficients $r_k^{(d)} = \frac{1}{\pi} \int_{0}^{\pi} \cos kx \, dW(x)$, we seek a continuous stationary
process with correlation function $R^{(c)}(u)$ such that, for
integral $k$-values, $R^{(c)}(k) = r_k^{(d)}$. Taking $V(x) = \frac{1}{\pi} W(x)$ for $0 \le x \le \pi$,
and $V(x) = 1$ for $x \ge \pi$, such a function is evidently yielded by

(123)  $R^{(c)}(u) = \int_{0}^{\infty} \cos ux \cdot dV(x)$.

It should be observed that (123) remains unchanged for integral
$u$-values when substituting, for instance, $V(x - n \cdot 2\pi)$ for $V(x)$,
letting $n$ denote a positive integer. Thus, the interpolating function
$R^{(c)}(u)$ will by no means be uniquely determined by the autocorrelation
coefficients $r_k^{(d)}$ given. This indeterminateness is, of
course, analogous to the circumstance mentioned in section 4 that
there exists an infinite number of simple harmonics all of which
pass through all of the values $x(t_n)$ taken on in equidistant points
$t_n$ by a simple harmonic $x(t)$.
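The indeterminateness just mentioned is easily seen numerically. In the sketch below (the frequency 0.7, the phase 0.3 and the helper name are arbitrary illustrative choices) simple harmonics whose angular frequencies differ by multiples of $2\pi$ coincide in all integral time points, while they differ between these points.

```python
import math

def harmonic(lam, t, amplitude=1.0, phase=0.0):
    """Simple harmonic amplitude * cos(lam * t + phase)."""
    return amplitude * math.cos(lam * t + phase)

lam = 0.7
aliases = [lam + 2 * math.pi * n for n in range(4)]   # lam, lam + 2*pi, ...

# Largest discrepancy between the aliases and the original in integral points.
integer_gap = max(
    abs(harmonic(a, t, 1.0, 0.3) - harmonic(lam, t, 1.0, 0.3))
    for a in aliases for t in range(50)
)

# Discrepancy at the non-integral time point t = 0.5.
half_gap = abs(harmonic(aliases[1], 0.5, 1.0, 0.3) - harmonic(lam, 0.5, 1.0, 0.3))
```

Values observed in integral time points alone therefore cannot distinguish between the frequencies $\lambda + n \cdot 2\pi$.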

In case the process $\{\xi^{(d)}(t)\}$ considered is normal, the Khintchine-Kolmogoroff
device for constructing a normal continuous process
may be applied on the basis of an interpolating $R^{(c)}(u)$ as given,
for instance, by (123). Furnishing the normal continuous process
with the same mean and the same dispersion as $\{\xi^{(d)}(t)\}$, the resulting
process will possess the property desired, i. e. give rise to the primary
discrete process $\{\xi^{(d)}(t)\}$ when considering the probability relations
in integral time points.

We conclude from the above that any theorem on the continuous
stationary process also holds, mutatis mutandis, for the discrete
process. On the other hand, since there possibly does not exist
any continuous process interpolating an arbitrarily given discrete
process, a theorem concerning the latter may be vacuous and trivial
when applied to continuous processes.



In full agreement with the general results, we find that for integral $k$-values …

On the other hand, in case a discrete process is given, and has $W(x)$ as defined
by (124) for generating function, one of the interpolating correlation functions will
be obtained by taking $R^{(c)}(u) = r_u$, where $r_u$ for $-\infty < u < \infty$ is given by (125).
The corresponding function $V(x)$ is, of course, equal to $\frac{1}{\pi} W(x)$ in $(0, \pi)$ and equal
to 1 in $(\pi, \infty)$.

The second application of formula (115) is concerned with the
periodogram method for graduating a sample series section connected
with a stationary process. Only discrete processes will be considered;
the arguments also hold, however, in the continuous case.

First, an investigation will be made as to whether a Schuster
periodogram analysis of a sample series section, say $(\xi_{t-1}, \ldots, \xi_{t-n})$,
may be expected to be effective, in particular for large $n$-values.
It will turn out that if the generating function of the process has
a non-vanishing saltus component, the periodogram analysis will
give positive results, viz. in the sense that the expectance of certain
well-defined periodogram ordinates will be positive. On the other
hand, in case the corollary of theorem 5 applies, a periodogram
analysis will prove resultless.

Let $(\xi_{t-1}, \xi_{t-2}, \ldots)$ stand for a sample series belonging to a
discrete stationary process $\{\xi(t)\}$ with dispersion $\sigma$, mean $m$, and
generating function $W(x)$.

Applying the classical Schuster periodogram analysis to the
sample series section $(\xi_{t-1}, \xi_{t-2}, \ldots, \xi_{t-n})$, let the resulting periodogram
functions be denoted by (see (26) and (27))

$A(n, \lambda) = \frac{2}{n} \sum_{p=1}^{n} (\xi_{t_0+p} - m) \cos \lambda p$;

$B(n, \lambda) = \frac{2}{n} \sum_{q=1}^{n} (\xi_{t_0+q} - m) \sin \lambda q$;

(126)  $C^2(n, \lambda) = A^2(n, \lambda) + B^2(n, \lambda)$,

where $t_0 = t - n - 1$. Studying in the first place the expression

$E = \lim_{n \to \infty} E[C^2(n, \lambda)]$,

elementary transformations yield

(127)  $E[C^2(n, \lambda)] =$

$= \frac{4}{n^2} \sum_{p=1}^{n} \sum_{q=1}^{n} (\cos \lambda p \cdot \cos \lambda q + \sin \lambda p \cdot \sin \lambda q) \, E[(\xi(t_0 + p) - m)(\xi(t_0 + q) - m)] =$

$= \frac{4 \sigma^2}{n^2} \sum_{p=1}^{n} \sum_{q=1}^{n} r_{|p-q|} \cos \lambda (p - q) = \frac{4 \sigma^2}{n^2} \sum_{k=-n+1}^{n-1} (n - |k|) \, r_k \cos \lambda k =$

$= \frac{4 \sigma^2}{n} \Big[ 1 + 2 \sum_{k=1}^{n-1} \Big( 1 - \frac{k}{n} \Big) r_k \cos \lambda k \Big]$.

In case the process is non-autocorrelated, we have $r_k = 0$ for $k > 0$. The
above formula then reduces to the Schuster formula (37).
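The chain of equalities (127) lends itself to a direct numerical check. The sketch below is illustrative only: the function names, the autocorrelation sequence $r_k = 0.6^{k}$, and the values of $n$ and $\lambda$ are assumptions made for the example. It evaluates the double sum over $p, q$ and the last member of (127) separately, and also treats the non-autocorrelated case.

```python
import math

def expected_ordinate_double_sum(r, n, lam, sigma=1.0):
    """E[C^2(n, lam)] from the double sum over p, q in (127)."""
    total = sum(
        r(abs(p - q)) * math.cos(lam * (p - q))
        for p in range(1, n + 1) for q in range(1, n + 1)
    )
    return 4.0 * sigma ** 2 * total / n ** 2

def expected_ordinate_weighted(r, n, lam, sigma=1.0):
    """The same expectation from the last member of (127)."""
    inner = 1.0 + 2.0 * sum(
        (1.0 - k / n) * r(k) * math.cos(lam * k) for k in range(1, n)
    )
    return 4.0 * sigma ** 2 * inner / n

autocorrelated = lambda k: 0.6 ** k                  # r_0 = 1, r_k = 0.6^k
uncorrelated = lambda k: 1.0 if k == 0 else 0.0      # r_k = 0 for k > 0

n, lam, sigma = 8, 0.9, 1.5
by_double_sum = expected_ordinate_double_sum(autocorrelated, n, lam, sigma)
by_weights = expected_ordinate_weighted(autocorrelated, n, lam, sigma)
white_case = expected_ordinate_weighted(uncorrelated, n, lam, sigma)
```

In the non-autocorrelated case the expectation equals $4\sigma^2 / n$ whatever the value of $\lambda$.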

In a study on sampling problems in intercorrelated series, E. Slutsky (1934)
investigates, i. a., expectations of type $E[C^2(n, \lambda)]$ for $\lambda$-values equalling multiples
of $2\pi / n$, i. e. expectations connected with the Fourier coefficients of a sample
series section $(\xi_{t-1}, \ldots, \xi_{t-n})$ (cf. (25)). Under certain restrictive conditions concerning
the process considered, Slutsky gives the relations (127).

J. Bartels (1935) seems to be the first to have deduced the relations (127) without
setting restrictive conditions on the process analysed.

In order to avoid discussions which might obscure the point of
the analysis, we shall now introduce a restriction concerning the
generating function of the autocorrelation coefficients, viz. that
$a^{(3)} = 0$, and that $\sum |r_k^{(1)}|$ is convergent. According to (121), the
latter assumption implies that $dW^{(1)}(\lambda) / d\lambda$ is finite for all $\lambda$.

It is seen that the limit expectation $E$ may be split up into
portions corresponding to (119), say $a^{(1)} \cdot E^{(1)}$, $a^{(2)} \cdot E^{(2)}$ and $a^{(3)} \cdot E^{(3)}$.
By the simplifying hypotheses made, we have $a^{(3)} \cdot E^{(3)} = 0$.

Considering the relations (127), we observe that an elementary
transformation gives

(128)  $1 + 2 \sum_{k=1}^{n-1} \Big( 1 - \frac{k}{n} \Big) r_k \cos \lambda k = \frac{1}{n} \sum_{s=0}^{n-1} \sum_{k=-s}^{s} r_k \cos \lambda k$,

and that the right member is a Cesàro mean, viz. the arithmetical
mean of the first $n$ partial sums $\sum_{k=-s}^{s} r_k \cos \lambda k$ of the series appearing
in (121). After substitution of $r_k^{(1)}$ for $r_k$, the expression (128) thus
will tend to $dW^{(1)}(\lambda) / d\lambda$ as $n \to \infty$. As by hypothesis this derivative
is bounded in modulus, we conclude that for all $\lambda$ in $(0, \pi)$

(129)  $E^{(1)} = \frac{4}{n} \cdot \frac{dW^{(1)}(\lambda)}{d\lambda} + o(1/n) \to 0$  as $n \to \infty$.

Paying regard to (120), a similar argument gives

(130)  $E^{(2)} = \lim_{n \to \infty} \frac{4}{n} \Big[ 1 + 2 \sum_{k=1}^{n-1} \Big( 1 - \frac{k}{n} \Big) r_k^{(2)} \cos \lambda k \Big]$  $= 0$ for $\lambda \ne \lambda_\nu$,  $= c_\nu^2$ for $\lambda = \lambda_\nu < \pi$.

Thus, while an ordinate $C^2(n, \lambda_\nu)$ in the periodogram of a sample
series may vary from one series to another, its expectation is by a
simple limit relation connected with the saltus, $c_\nu^2 \cdot \pi / 2$, in $\lambda = \lambda_\nu$ of
the generating function $W(\lambda)$ of the autocorrelation coefficients.

It is seen from (129)-(130) that, under the assumptions made, it is
only if $a^{(2)} > 0$ that a periodogram analysis of a sample series will
be fruitful. Assuming that there are $s$ discontinuities in $W(x)$,
the analysis will result in a composed harmonic,

$X_n(t - k) = \sum_{\nu=1}^{s} A(n, \lambda_\nu) \cos (n + 1 - k) \lambda_\nu + \sum_{\nu=1}^{s} B(n, \lambda_\nu) \sin (n + 1 - k) \lambda_\nu$,

approximating the section $(\xi_{t-1}, \ldots, \xi_{t-n})$ analysed, and with coefficients
depending on the sample series considered and satisfying

$\lim_{n \to \infty} \frac{1}{\sigma^2} E[A^2(n, \lambda_\nu) + B^2(n, \lambda_\nu)] = c_\nu^2$.

A standard measure of the deviation $\xi_{t-k} - X_n(t - k)$ is given by
the expectation $E$ defined by

$E = E\Big[ \frac{1}{n} \sum_{k=1}^{n} [\xi_{t-k} - X_n(t - k)]^2 \Big]$.

Disregarding terms of order $1/n$, the coefficients $A$ and $B$ in
$X_n(t - k)$ will make $E$ a minimum (cf. (43)),

$E = \Big[ 1 - \frac{1}{2} \sum_{\nu} c_\nu^2 \Big] \cdot \sigma^2$.

This relation shows clearly the scope of the periodogram analysis.
In fact, in the special case $W(x) = W^{(2)}(x)$, the approximating function
$X_n(t - k)$ will, for sufficiently large $n$-values, yield a fit as close
as desired.
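The contrast between (129) and (130) may be illustrated numerically. The sketch below rests on assumptions made only for the example: (120) is read as $r_k^{(2)} = \frac{1}{2} \sum c_i^2 \cos \lambda_i k$, a pure harmonic with a single saltus point $\lambda_0 = 1$ is taken (so that $r_k = \cos \lambda_0 k$ and $c^2 = 2$), the absolutely summable case is represented by $r_k = 0.5^{k}$, and the function name is hypothetical.

```python
import math

def scaled_fejer_sum(r, n, lam):
    """(4/n) * [1 + 2 * sum_{k=1}^{n-1} (1 - k/n) r_k cos(lam k)], cf. (128)-(130)."""
    inner = 1.0 + 2.0 * sum(
        (1.0 - k / n) * r(k) * math.cos(lam * k) for k in range(1, n)
    )
    return 4.0 * inner / n

lam0 = 1.0
saltus_r = lambda k: math.cos(lam0 * k)   # one saltus point, c^2 = 2 (assumed reading of (120))
smooth_r = lambda k: 0.5 ** k             # sum of |r_k| convergent

n = 4000
at_saltus = scaled_fejer_sum(saltus_r, n, lam0)    # close to c^2 = 2
off_saltus = scaled_fejer_sum(saltus_r, n, 2.0)    # close to 0
smooth_case = scaled_fejer_sum(smooth_r, n, lam0)  # of order 1/n
```

Only the ordinate taken at the saltus point keeps a positive expectation as $n$ grows; the others are of the order $1/n$.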

On the other hand, it follows from the previous analysis that in
case $\sum |r_k|$ is convergent, then $W(x) = W^{(1)}(x)$, and $dW(x)/dx$ is
bounded. In this case, a periodogram analysis of the section
$(\xi_{t-1}, \xi_{t-2}, \ldots)$ will be resultless (cf. (129)). Processes of this type
have to be attacked by the use of entirely different methods.
Now, a method at once suggesting itself is that of linear regression
analysis. For instance, approximating $\xi(t)$ linearly by means
of $\xi(t - 1)$ we obtain $\xi(t) - m = r_1 \cdot [\xi(t - 1) - m] + \eta(t)$, denoting by $\eta(t)$ a
residual variable with variance $\sigma^2 \cdot (1 - r_1^2)$. Otherwise expressed,
$\xi_{t-1}$ yields an effective prognosis of $\xi_t$, with a squared dispersion
equalling $\sigma^2 \cdot (1 - r_1^2)$. This simple instance is
sufficient for exemplifying that this method is of quite another
type than the periodogram analysis. The next stage of the present
study will be to follow up this line of research; this is done in
section 19. The coming section is reserved for a preparatory survey.

For illustrations of periodogram analysis of model series, reference
is given to section 25. For the present we only remark that the
general formula (127) gives

(131)  $\frac{1}{\pi} \int_{0}^{\pi} E[C^2(n, \lambda)] \, d\lambda = \frac{4 \sigma^2}{n}$.

Comparing with (37), the expectance of the ordinates in a periodogram
is seen to be a function of $\lambda$ with mean value equalling the
constant (in respect of $\lambda$) expectance in the case of a purely random
process.

It is rather interesting to compare the relations (131) and (130).
It is seen that (131) also holds in the case of a process of hidden
periodicities, and that in such a process the ordinates $E[C^2(n; \lambda_\nu)]$
will not tend to zero as $n \to \infty$. We conclude that the rise to a
maximum in $E[C^2(n; \lambda)]$ will be very rapid, and that the corresponding
peaks in the periodogram will be very thin, with a
breadth of an order of magnitude not surpassing $1/n$.

On the other hand, if the corollary to theorem 5 applies, the
expectance $E[C^2(n; \lambda)]$ evidently tends to zero uniformly in $\lambda$ as
$n \to \infty$,

$E[C^2(n; \lambda)] = \frac{4 \sigma^2}{n} \cdot W'(\lambda) + o(1/n)$.

In full agreement with (121) we conclude from (131) that in the
present case $\frac{1}{\pi} \int_{0}^{\pi} W'(\lambda) \, d\lambda = 1$.


18. On linear approximation in a space of random variables. 

The next section presenting a generalization of the regression 
analysis of ordinary random variables to the general discrete sta- 
tionary process, the present section is reserved for an interpretation 
of the ordinary regression analysis as a linear approximation in a 
metrical space. The interpretation applies, of course, to statistical 
as well as to aleatory random variables. 

When dealing in the following with a set of random variables,
say $\xi^{(1)}, \xi^{(2)}, \ldots$, we shall tacitly assume that any finite sub-group, say
$\xi^{(i_1)}, \ldots, \xi^{(i_n)}$, forms a well-defined multi-dimensional variable, say
$[\xi^{(i_1)}, \ldots, \xi^{(i_n)}]$, with distribution function $F(i_1, \ldots, i_n; u_1, \ldots, u_n)$,
and that the functions $F$ are consistent
(cf. p. 32 f.). As observed by M. Fréchet ((1937), p. 205 f.), a set of
one-dimensional random variables with finite dispersion can be made
metrical by defining the distance between two variables in the set,
say $\xi$ and $\eta$, as the dispersion of the difference variable $\xi - \eta$,

(132)  $D(\xi - \eta)$.

This fact depends on the simple inequality

(133)  $D(\xi - \zeta) \le D(\xi - \eta) + D(\eta - \zeta)$,

which by elementary transformations reduces to the inequality of
Schwarz.

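Adopting the distance definition (132), next let $a$ stand for a real
number, and let $\xi$ and $\eta$ be two random variables. Then we have

(134)  $D^2(\xi - a \cdot \eta) = D^2(\xi) - 2a \cdot r(\xi, \eta) \cdot D(\xi) \cdot D(\eta) + a^2 \cdot D^2(\eta)$.

This squared distance will reach a minimum equalling

(135)  $D^2(\xi) \cdot [1 - r^2(\xi, \eta)]$

when $a$ equals the regression coefficient of $\xi$ on $\eta$, i. e. when

(136)  $a = r(\xi, \eta) \cdot D(\xi) / D(\eta)$.

This regression coefficient is linear in respect to $\xi$. In fact,

(137)  $r(\xi + \zeta, \eta) \cdot \frac{D(\xi + \zeta)}{D(\eta)} = \frac{E[(\xi - E[\xi] + \zeta - E[\zeta])(\eta - E[\eta])]}{D^2(\eta)} =$

$= \frac{E[(\xi - E[\xi])(\eta - E[\eta])] + E[(\zeta - E[\zeta])(\eta - E[\eta])]}{D^2(\eta)} = r(\xi, \eta) \cdot \frac{D(\xi)}{D(\eta)} + r(\zeta, \eta) \cdot \frac{D(\zeta)}{D(\eta)}$.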
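The formulae (134)-(136) hold equally for the moments of a finite sample, which allows a small numerical check. In the sketch below the data values and the helper names are purely illustrative assumptions; the dispersion is computed in its population form (division by the number of observations).

```python
import math

def mean(values):
    return sum(values) / len(values)

def dispersion(values):
    """Standard deviation (population form), playing the role of D."""
    m = mean(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / len(values))

def correlation(xs, ys):
    """Correlation coefficient r of two samples of equal length."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)
    return cov / (dispersion(xs) * dispersion(ys))

def squared_distance(xs, ys, a):
    """D^2(xi - a * eta), the left member of (134)."""
    return dispersion([x - a * y for x, y in zip(xs, ys)]) ** 2

xi = [2.0, 3.5, 1.0, 4.0, 5.5, 3.0]
eta = [1.0, 2.0, 0.5, 2.5, 3.0, 1.0]

r = correlation(xi, eta)
a_best = r * dispersion(xi) / dispersion(eta)     # regression coefficient (136)
minimum = dispersion(xi) ** 2 * (1.0 - r ** 2)    # minimum value (135)
```

Any other value of $a$ gives a larger squared distance than the regression coefficient does.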




Using these properties of regression coefficients, the multiple regression
theory founded and developed by the English statistical
school (see G. U. Yule and M. G. Kendall (1937), p. 511, for references)
can be interpreted as a particular branch of the theory of approximations
in general linear spaces. In the general terminology,
uncorrelated variables $\xi$ and $\eta$ should be called orthogonal elements
in the space to which they belong. For later application we
shall record some general approximation formulae in terms of determinants.
The verification being in detail parallel to the Gram-Schmidt
orthogonalization of vectors in an infinite number of dimensions,
reference is again made to G. Kowalewski ((1909), § 175).




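Let an $n$-dimensional variable $[\xi^{(1)}, \ldots, \xi^{(n)}]$ formed by variables
with finite dispersion be linearly non-singular (see p. 41 f.). Considering
an arbitrary variable $\xi^{(0)}$ with finite dispersion, and writing
$m_i = E[\xi^{(i)}]$, $\mu_{ik} = E[(\xi^{(i)} - m_i)(\xi^{(k)} - m_k)]$, there exists a well-defined
sequence of coefficients $a_{in}$ minimizing the dispersion of the variable
defined for arbitrary $a_{in}$'s by

(138)  $\eta^{(n)} = \xi^{(0)} - m_0 - a_{1n} \cdot (\xi^{(1)} - m_1) - \cdots - a_{nn} \cdot (\xi^{(n)} - m_n)$.

Terming the variable (138) formed by the minimizing coefficients $a_{in}$
residual, and denoting it by $\eta^{(n)}$, we have $E[\eta^{(n)}] = 0$, and

(139)  $\eta^{(n)} = \begin{vmatrix} \mu_{11} & \cdots & \mu_{1n} & \xi^{(1)} - m_1 \\ \vdots & & \vdots & \vdots \\ \mu_{n1} & \cdots & \mu_{nn} & \xi^{(n)} - m_n \\ \mu_{01} & \cdots & \mu_{0n} & \xi^{(0)} - m_0 \end{vmatrix} : \begin{vmatrix} \mu_{11} & \cdots & \mu_{1n} \\ \vdots & & \vdots \\ \mu_{n1} & \cdots & \mu_{nn} \end{vmatrix}$;

$D^2(\eta^{(n)}) = \begin{vmatrix} \mu_{11} & \cdots & \mu_{1n} & \mu_{10} \\ \vdots & & \vdots & \vdots \\ \mu_{n1} & \cdots & \mu_{nn} & \mu_{n0} \\ \mu_{01} & \cdots & \mu_{0n} & \mu_{00} \end{vmatrix} : \begin{vmatrix} \mu_{11} & \cdots & \mu_{1n} \\ \vdots & & \vdots \\ \mu_{n1} & \cdots & \mu_{nn} \end{vmatrix}$.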




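A set of formulae equivalent to (139), and involving an auxiliary
set of one-dimensional aleatory variables $\zeta^{(1)}, \ldots, \zeta^{(n)}$, is given by

(140)  $\eta^{(n)} = \xi^{(0)} - m_0 - c_1 \cdot \zeta^{(1)} - \cdots - c_n \cdot \zeta^{(n)}$,

(141)  $D^2(\eta^{(n)}) = D^2(\xi^{(0)}) - c_1^2 - c_2^2 - \cdots - c_n^2$.

For the auxiliary variables we have (cf. G. Kowalewski (1909)
p. 426 and T. Lindblad (1937)),

(142)  $\zeta^{(1)} = \frac{\xi^{(1)} - m_1}{\sqrt{\mu_{11}}}$;  $\zeta^{(2)} = \begin{vmatrix} \mu_{11} & \xi^{(1)} - m_1 \\ \mu_{21} & \xi^{(2)} - m_2 \end{vmatrix} : \sqrt{\mu_{11} \cdot \begin{vmatrix} \mu_{11} & \mu_{12} \\ \mu_{21} & \mu_{22} \end{vmatrix}}$;

$\zeta^{(3)} = \begin{vmatrix} \mu_{11} & \mu_{12} & \xi^{(1)} - m_1 \\ \mu_{21} & \mu_{22} & \xi^{(2)} - m_2 \\ \mu_{31} & \mu_{32} & \xi^{(3)} - m_3 \end{vmatrix} : \sqrt{\begin{vmatrix} \mu_{11} & \mu_{12} \\ \mu_{21} & \mu_{22} \end{vmatrix} \cdot \begin{vmatrix} \mu_{11} & \mu_{12} & \mu_{13} \\ \mu_{21} & \mu_{22} & \mu_{23} \\ \mu_{31} & \mu_{32} & \mu_{33} \end{vmatrix}}$;  $\ldots$

These variables are standardised, i. e.

(143)  $E[(\zeta^{(i)})^2] = D^2[\zeta^{(i)}] = 1$.

They are further mutually non-correlated,

(144)  $r(\zeta^{(i)}, \zeta^{(k)}) = E[\zeta^{(i)} \cdot \zeta^{(k)}] = 0$  for $i \ne k$,

and it is thanks to this relation that the coefficients $c_i$ in (140) are
independent of $n$,

(145)  $c_i = r(\xi^{(0)}, \zeta^{(i)}) \cdot D(\xi^{(0)}) = E[(\xi^{(0)} - m_0) \cdot \zeta^{(i)}]$.

On the other hand, in case $[\xi^{(1)}, \ldots, \xi^{(n)}]$ is singular, say of rank
$h$, and on account of the relations (69), the coefficients $a_{in}$ in the
residual variable $\eta^{(n)}$ will not be uniquely determined. In fact, the
addition of an arbitrary linear combination of the vanishing sums
(69) will not change the variable (138). In full agreement herewith,
the expressions (139) become indeterminate when $[\xi^{(1)}, \ldots, \xi^{(n)}]$ is taken
to be singular.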

Whether or not $[\xi^{(1)}, \ldots, \xi^{(n)}]$ be singular, the residual variable $\eta^{(n)}$
will be uncorrelated with every $\xi^{(i)}$ ($i = 1, \ldots, n$). In fact, formulae (134)-(136)
imply that otherwise the variance $D^2(\eta^{(n)})$ could be brought down
by subtraction in $\eta^{(n)}$ of a non-vanishing multiple of the variable
$\xi^{(i)} - m_i$.

Considering the residuals $\eta^{(1)}, \eta^{(2)}, \ldots$, it follows from (141)
that the variances $D^2(\eta^{(1)}), D^2(\eta^{(2)}), \ldots$ form a non-increasing
sequence. It is of central importance that, when $D^2(\eta^{(i)}) > 0$, we
shall have $D^2(\eta^{(i+1)}) = D^2(\eta^{(i)})$ if, and only if, one of the following two
cases is present:

(A) $[\xi^{(1)}, \ldots, \xi^{(i+1)}]$ is singular by means of a relation of type

(146)  $b_1 \cdot (\xi^{(1)} - m_1) + \cdots + b_i \cdot (\xi^{(i)} - m_i) = \xi^{(i+1)} - m_{i+1}$.

(B) No relation of type (146) exists, but $\eta^{(i)}$ is uncorrelated
with $\xi^{(i+1)}$.

It follows, i. a., that if $[\xi^{(1)}, \ldots, \xi^{(n)}]$ is singular of rank $h$, the
variables $\xi^{(i)}$ may be arranged in an order such that the coefficients
$a_{pq}$ in the residuals $\eta^{(1)}, \ldots, \eta^{(h)}$, and only in these, are uniquely
determined, while

(147)  $D(\eta^{(1)}) \ge D(\eta^{(2)}) \ge \cdots \ge D(\eta^{(h)}) = D(\eta^{(h+1)}) = \cdots = D(\eta^{(n)}) \ge 0$.

In other words, the first $h$ variables $\xi^{(i)}$ alone will be able to
bring down the residual dispersion to its minimum value.

The following determinant expression for a general partial correlation
coefficient in the notation of G. U. Yule (see G. U. Yule
and M. G. Kendall (1937), p. 269) will attach the above system
of formulae to the familiar theory of multiple correlation. The
formula is valid in case none of the variables $[\xi^{(0)}, \xi^{(1)}, \ldots, \xi^{(n-1)}]$
and $[\xi^{(1)}, \ldots, \xi^{(n)}]$ is singular.

…

A proof from $n$ to $n + 1$ will verify

$D^2(\eta^{(n)}) = D^2(\xi^{(0)}) \cdot (1 - r_{01}^2) (1 - r_{02 \cdot 1}^2) \cdots (1 - r_{0n \cdot 12 \ldots (n-1)}^2)$,

a formula which shows, i. a., that also the partial correlation
coefficients lie in the interval $(-1, 1)$.

For later application, we record that

(148)  $a_{1n} = r_{01 \cdot 23 \ldots n} \cdot \frac{D_{0 \cdot 23 \ldots n}}{D_{1 \cdot 23 \ldots n}}$,

and analogous formulae hold for the remaining coefficients $a_{in}$
(cf. G. U. Yule and M. G. Kendall (1937), p. 266). Here $D_{i \cdot 23 \ldots n}$
for $i = 0$ or 1 represents the dispersion of the residual variable
obtained when approximating $\xi^{(i)}$ by the variables $\xi^{(2)}, \ldots, \xi^{(n)}$.

For application in forecast problems, let us interpret (138) in
terms of conditioned expectations. Writing $(C) = (\xi^{(1)} = \xi_1, \ldots, \xi^{(n)} = \xi_n)$,
an estimate of $E_C[\xi^{(0)}]$ which is the best linear one in the
sense of the principle of least squares is yielded by

(149)  $F_C[\xi^{(0)}] = m_0 + a_{1n} (\xi_1 - m_1) + \cdots + a_{nn} (\xi_n - m_n)$.

We have

…

and

$E[\xi^{(0)} - F_C]^2 = E[\eta^{(n)}]^2 = D^2(\eta^{(n)})$,

where the latter relation results from (57).

If the variables $\xi^{(1)}, \ldots, \xi^{(n)}$ are mutually uncorrelated, the coefficients
$a_{ik}$ are independent of $k$. Writing $a_{ik} = a_i$, and taking
$(C) = (\xi^{(1)} = \xi_1, \ldots, \xi^{(k)} = \xi_k)$, we get in this case

(150)  $F_C[\xi^{(0)}] = m_0 + a_1 (\xi_1 - m_1) + \cdots + a_k (\xi_k - m_k)$.

Further, if $\eta^{(n)}$ is independent of the variables $\xi^{(1)}, \ldots, \xi^{(n)}$,
we have

(151)  $E_C[\eta^{(n)}] = E[\eta^{(n)}] = 0$,

and

(152)  $E_C[\xi^{(0)}] = m_0 + a_{1n} (\xi_1 - m_1) + \cdots + a_{nn} (\xi_n - m_n)$.

19. Linear autoregression analysis of the discrete stationary process. 

In this section, the linear regression analysis as surveyed in the
previous section will be applied to the variables $\xi(t)$ connected with
a stationary process $\{\xi(t)\}$. Approximating $\xi(t)$ by means of
$\xi(t - 1), \ldots, \xi(t - n)$, a well-defined procedure of consecutive approximations
will be given. After a passage to the limit, we shall arrive
at a residual variable with properties corresponding to the case
of a finite number of approximations.

Let $\{\xi(t)\}$ be a stationary process with finite dispersion $\sigma$, with
mean $m$, and with principal correlation determinants $\Delta(r, n)$ given
by (81). Approximating $\xi(t)$ by means of $\xi(t - 1, \ldots, t - n) =$
$= [\xi(t - 1), \ldots, \xi(t - n)]$ as described in the previous section, let the
residual variable be given by

(153)  $\eta(t; n) = \xi(t) - m - a(1, n) \cdot [\xi(t - 1) - m] - a(2, n) \cdot [\xi(t - 2) - m] - \cdots - a(n, n) \cdot [\xi(t - n) - m]$.

In case $\Delta(r, n - 1) \ne 0$, formula (139) yields

(154)  $D^2(\xi(t)) \ge D^2(\eta(t; n)) = \sigma^2 \cdot \Delta(r, n) / \Delta(r, n - 1) \ge 0$.

Since $D^2(\eta(t; n))$, $n = 1, 2, \ldots$ forms a non-increasing sequence (cf.
(147)), we get

(155)  $1 \ge \frac{\Delta(r, 1)}{1} \ge \frac{\Delta(r, 2)}{\Delta(r, 1)} \ge \cdots \ge \frac{\Delta(r, n)}{\Delta(r, n - 1)} \ge \cdots$



According to the analysis in section 14, one of two cases is
present. Either $\Delta(r, n)$ is above zero for all $n$, or $\Delta(r, n) > 0$
for $n < h$ while $\Delta(r, n) = 0$ for $n \ge h$. The number $h$ appearing in
the latter case equals the rank of linear singularity of the process
considered. Paying regard to these facts, and to the relation (155),
it is evident that any stationary process belongs to one, and only
to one, of the following classes:

(I). The process is non-singular, and there exists a positive constant
$\kappa \le 1$ such that

(156)  $D^2(\eta(t; n)) / \sigma^2 = \Delta(r, n) / \Delta(r, n - 1) \to \kappa^2 \le 1$  as $n \to \infty$.

(II). The process is singular, say of rank $h$.

(III). The process presents no singularity of finite rank, but

(157)  $D^2(\eta(t; n)) / \sigma^2 = \Delta(r, n) / \Delta(r, n - 1) \to 0$  as $n \to \infty$.

In this case, the process will be termed singular of infinite rank.

In the following analysis, we shall assume that the process
considered belongs to the first class.
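The behaviour of the ratios in (155)-(157) can be examined on small examples. The sketch below is illustrative and rests on assumptions: $\Delta(r, n)$ is taken to be the determinant of order $n + 1$ formed by the elements $r_{|i-j|}$ (this agrees with the denominator of (159) but is not restated in this section), and the two autocorrelation sequences and all function names are chosen only for the example. The first sequence ($r_1 = 0.4$, $r_k = 0$ for $k > 1$) belongs to class (I); the second ($r_k = \cos k$) is a single harmonic and belongs to class (II).

```python
import math

def toeplitz_det(r, n):
    """Delta(r, n): determinant of the (n+1) x (n+1) matrix [r(|i - j|)] (assumed definition)."""
    size = n + 1
    a = [[r(abs(i - j)) for j in range(size)] for i in range(size)]
    det = 1.0
    for col in range(size):
        # Gaussian elimination with partial pivoting.
        pivot = max(range(col, size), key=lambda row: abs(a[row][col]))
        if abs(a[pivot][col]) < 1e-12:
            return 0.0                      # numerically singular
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det
        det *= a[col][col]
        for row in range(col + 1, size):
            factor = a[row][col] / a[col][col]
            for k in range(col, size):
                a[row][k] -= factor * a[col][k]
    return det

def residual_ratios(r, n_max):
    """Delta(r, n) / Delta(r, n - 1) for n = 1, ..., n_max, cf. (154)-(155)."""
    dets = [toeplitz_det(r, n) for n in range(n_max + 1)]
    return [dets[n] / dets[n - 1] for n in range(1, n_max + 1)]

non_singular = lambda k: {0: 1.0, 1: 0.4}.get(k, 0.0)   # class (I)
harmonic = lambda k: math.cos(1.0 * k)                  # class (II), rank 2

ratios = residual_ratios(non_singular, 12)   # decreases from 0.84 towards 0.8
rank_two_det = toeplitz_det(harmonic, 2)     # vanishes
```

In the first example the ratios decrease to the positive limit 0.8; in the second $\Delta(r, 1) > 0$ while $\Delta(r, 2) = 0$.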

Autoregression analysis of a non-singular pi'ocess. According to 
(140), the variables (153) may be written on the form 

(158) ri it; n)^^it)-m-c, (0 - c, ^lit) it ) , 

where the standardized variables it) are given by (cf. (142)) 


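(159)  $\zeta_n(t) = \dfrac{1}{\sigma \cdot \sqrt{\Delta(r, n-1) \cdot \Delta(r, n-2)}} \begin{vmatrix} 1 & r_1 & \cdots & r_{n-2} & \xi(t-1) - m \\ r_1 & 1 & \cdots & r_{n-3} & \xi(t-2) - m \\ \vdots & \vdots & & \vdots & \vdots \\ r_{n-1} & r_{n-2} & \cdots & r_1 & \xi(t-n) - m \end{vmatrix}$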




and where the coefficients $c_i$ are uniquely determined, and
independent of $n$ (cf. (145)),

(160)

According to the general analysis in section 13, the variables
$\zeta_n(t)$ will, for an arbitrarily fixed $n$, constitute a stationary process
$\{\zeta_n(t)\}$. It is seen that the variable $\zeta_n[t]$ defined by this process
will be of type (62). Similarly, for any $n$ the variables $\eta(t; n)$ will
constitute a stationary process $\{\eta(t; n)\}$ of the same type (62). It
will next be shown that the sequence $\{\eta(t; n)\}$ is convergent in
probability as $n \to \infty$.

According to (141) and (156) we have in the first place

$D^2(\eta(t; n)) = \sigma^2 - c_1^2 - c_2^2 - \cdots - c_n^2 \to \kappa^2 \cdot \sigma^2$ as $n \to \infty$.

We conclude that $\sum c_i^2$ is convergent, and that

(161)  $c_1^2 + c_2^2 + \cdots = (1 - \kappa^2) \cdot \sigma^2.$

Next, the variables $\zeta_1(t), \zeta_2(t), \ldots$ being uncorrelated (cf. (144)), we get

$D^2(\eta(t; n+p) - \eta(t; n)) = c_{n+1}^2 + \cdots + c_{n+p}^2.$

Keeping in mind that $\sum c_i^2$ is convergent, it follows that this
dispersion tends to zero uniformly in $p$ as $n \to \infty$. Hence, paying regard
to a remark in section 13 (see p. 40 f.), we conclude that the
sequence $\eta(t; 1), \eta(t; 2), \ldots$ is convergent in probability. According
to theorem 1, this implies that the sequence

(162)  $\{\eta(t; 1)\}, \{\eta(t; 2)\}, \ldots$

will also converge in probability.

The limit process of the sequence (162) will be denoted $\{\eta(t)\}$.
In analogy to the case of a finite number of approximations, the
process $\{\eta(t)\}$ and the corresponding variables $\eta(t_1, \ldots, t_n)$ will be
termed residual. According to a remark on p. 41, the mean and
dispersion of the residual process will be given by the corresponding
limit characteristics of the sequence (162).

Observing that $\eta(t; n)$ is non-correlated with the variables
$\xi(t-1), \ldots, \xi(t-n)$, and keeping in mind that the dispersion of
$\eta(t; n)$ lies above a positive constant, it follows readily that the
limit residual $\eta(t)$ is non-correlated with $\xi(t-n)$ for any positive
integer $n$. Hence we conclude that $\eta(t)$ for any $k \geq 0$ is uncorrelated
with all of the variables $\zeta_1(t-k), \zeta_2(t-k), \ldots$ defined by
(159) (cf. (137)). Further we have

$E[(\xi(t) - m) \cdot \eta(t)] = \lim_{n \to \infty} E[(\xi(t) - m - c_1 \cdot \zeta_1(t) - \cdots - c_n \cdot \zeta_n(t)) \cdot \eta(t)] = D^2(\eta(t)).$

Hence, paying regard to the relations (cf. p. 77)

$E[\eta(t)] = E[\eta(t; n)] = 0,$

we get

$r(\xi(t), \eta(t)) = D(\eta(t)) / D(\xi(t)) = \kappa.$

Writing generally

(163)  $r(\xi(t+n), \eta(t)) / \kappa = b_n,$

we thus have $b_0 = 1$, and $b_n = 0$ for $n < 0$.

The above-mentioned properties of the residuals correspond
directly with the finite case dealt with in the previous section.
There is also another important analogy. Considering in the finite
case the residuals obtained when approximating $\xi(t)$ and $\xi(t-p)$ by
$\xi(t-1, \ldots, t-n)$ and $\xi(t-p-1, \ldots, t-n)$ respectively, a short
reflection shows that these residuals are non-correlated. In order to
show, correspondingly, that the residuals $\eta(t)$ and $\eta(t-p)$ are
non-correlated, let it be observed that

$$\begin{aligned} E[\eta(t) \cdot \eta(t-p)] = \lim_{k \to \infty} E[\eta(t; k) \cdot (&\xi(t-p) - m - c_1 \cdot \zeta_1(t-p) - \cdots - c_{k-p} \cdot \zeta_{k-p}(t-p) \\ &- c_{k-p+1} \cdot \zeta_{k-p+1}(t-p) - \cdots - c_k \cdot \zeta_k(t-p))]. \end{aligned}$$

Denoting by $Q_{k-p}$ the sum appearing in the second row, we have
$E[Q_{k-p}] = 0$, and $D^2(Q_{k-p}) \to 0$ for any $p > 0$ as $k \to \infty$. Thus

$r(\eta(t), \eta(t-p)) = \lim_{k \to \infty} r(\eta(t; k), \; \xi(t-p) - m - c_1 \cdot \zeta_1(t-p) - \cdots - c_{k-p} \cdot \zeta_{k-p}(t-p)).$

According to previous remarks, the correlation coefficient in the
right member equals zero for any $p > 0$. Observing that

(164)  $r(\eta(t), \eta(t+p)) = r(\eta(t-p), \eta(t)) = r(\eta(t), \eta(t-p)) = 0,$

we conclude that the process $\{\eta(t)\}$ is non-autocorrelated.

Summing up the results, we get the following theorem.

Theorem 6. A residual process $\{\eta(t)\}$ obtained from a non-singular
stationary process $\{\xi(t)\}$ is stationary and non-autocorrelated. The
variable $\eta(t)$ is non-correlated with $\xi(t-1), \xi(t-2), \ldots$, while

$r(\xi(t), \eta(t)) = D(\eta(t)) / D(\xi(t)).$

The arguments used in the proof of this theorem also apply in
the remaining cases II and III. As the residual variables $\eta(t)$ are
here seen to be vanishing, their correlation properties will be
indeterminate. Accordingly, these cases need no further comment.

Illustrations of the autoregression analysis of stationary processes 
will be given in sections 25 and 26 (cf. also p. 92). 
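
For a finite number $n$ of approximations, the non-correlation between the residual and the variables used in the approximation can be verified directly from the correlogram. The Python sketch below is a minimal illustration under assumptions of our own: the dispersion is taken as unity, the minimizing coefficients $a(i, n)$ are obtained from the usual normal equations $\sum_i a(i, n) \cdot r_{|i-j|} = r_j$, and the correlogram is again that of a first-order moving average with weight $b = 0.5$. All function names are hypothetical.

```python
def solve(matrix, rhs):
    """Solve a linear system by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    a = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(a[i][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for i in range(n):
            if i != col:
                factor = a[i][col] / a[col][col]
                for j in range(col, n + 1):
                    a[i][j] -= factor * a[col][j]
    return [a[i][n] / a[i][i] for i in range(n)]


def regression_coefficients(r, n):
    """Coefficients a(1, n), ..., a(n, n) from the normal equations."""
    gram = [[r(abs(i - j)) for j in range(1, n + 1)] for i in range(1, n + 1)]
    return solve(gram, [r(i) for i in range(1, n + 1)])


def residual_covariances(r, n, lags):
    """Covariance (unit dispersion) between eta(t; n) and xi(t - j) for j in lags."""
    a = regression_coefficients(r, n)
    return [r(j) - sum(a[i - 1] * r(abs(i - j)) for i in range(1, n + 1))
            for j in lags]


b = 0.5  # illustrative moving-average weight (an assumption)


def r_ma1(k):
    """Correlogram of eta(t) + b * eta(t - 1) with purely random eta."""
    if k == 0:
        return 1.0
    if k == 1:
        return b / (1.0 + b * b)
    return 0.0


coeffs = regression_coefficients(r_ma1, 3)
covs = residual_covariances(r_ma1, 3, [1, 2, 3, 4])
# Residual variance in units of sigma^2: 1 - sum of a(i, n) * r_i.
residual_variance = 1.0 - sum(coeffs[i - 1] * r_ma1(i) for i in range(1, 4))
```

With $n = 3$ the covariances at lags 1, 2, 3 vanish, while the one at lag 4 does not. Only in the limit $n \to \infty$ is the residual non-correlated with every earlier variable, as stated in theorem 6.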


20. A canonical form of the discrete stationary process. 

In this section it will be shown that the residual processes arrived 
at in the preceding section give a basis for the construction of a 
canonical form of the stationary process with finite dispersion. 

Until further notice, the random variables considered will be 
assumed to have a vanishing mean. 

In the case of a finite number of approximations dealt with in
section 18, the following representation holds for the variable
subjected to the regression analysis (see formula (138))

(165)  $\xi(t) = \eta(t; n) + a_1 \cdot \xi(t-1) + \cdots + a_n \cdot \xi(t-n).$

Considering, on the other hand, the residuals $\eta(t; k)$ defined by
(153) and obtained from a stationary process $\{\xi(t)\}$, it may be (cf. (148))
that the minimizing coefficients $a(i, k)$ are bounded in modulus by
a constant not surpassing $1/\kappa$. In such a case, a diagonal selection
procedure will show that there exist a real sequence $a_1, a_2, \ldots$ and
a sequence of integers $k_1, k_2, \ldots$ such that for all $i$

$\lim_{s \to \infty} a(i, k_s) = a_i, \qquad |a_i| \leq 1/\kappa.$

Hence, we are led to ask if for these coefficients $a_i$

$\lim_{n \to \infty} [\eta(t) + a_1 \cdot \xi(t-1) + \cdots + a_n \cdot \xi(t-n)]$

exists and equals $\xi(t)$, a relation which would correspond to (165).
If the answer is affirmative, a non-singular stationary process $\{\xi(t)\}$
could always be written on the form

(166)  $\{\xi(t)\} = \{\eta(t)\} + a_1 \cdot \{\xi(t-1)\} + a_2 \cdot \{\xi(t-2)\} + \cdots$

where $\{\eta(t)\}$ is non-autocorrelated, and $\eta(t)$ is uncorrelated with
$\xi(t-1)$, $\xi(t-2)$, etc.

However, as will be seen in section 26, certain conditions would
have to be imposed upon $\{\xi(t)\}$ in order to secure the representation
(166). For the present we shall leave open all questions in this
matter, and proceed to an aspect of the finite case suggesting
another canonical form of the stationary process.

Writing

$\vartheta(t; k) = \xi(t) - \eta(t; k),$

it is evident that the sequence $\{\vartheta(t; k)\}$ is convergent as $k \to \infty$.
Denoting the limit process by $\{\vartheta(t)\}$, we obtain

$\{\xi(t)\} = \{\eta(t)\} + \{\vartheta(t)\}.$

Since $\vartheta(t; k)$ is a linear expression in $\xi(t-1), \ldots, \xi(t-k)$, and thus
uncorrelated with $\eta(t+n)$ for all $n \geq 0$, it follows that $\eta(t)$ is
uncorrelated with the variables $\vartheta(t)$, $\vartheta(t-1)$, etc. So far there is
a complete analogy with the finite relation (165). In further
analogy, $\vartheta(t; k)$ can be written as a sum involving the uncorrelated
residuals $\eta(t-1; k-1), \ldots, \eta(t-k+1; 1)$, and $\eta(t-k; 0) = \xi(t-k)$.

This circumstance suggests the question of whether $\{\vartheta(t)\}$ is a linear
expression in $\{\eta(t-1)\}$, $\{\eta(t-2)\}$, etc. Were the answer in the
affirmative, then $\{\xi(t)\}$ could always be written on the form
$\{\eta(t)\} + b_1 \cdot \{\eta(t-1)\} + b_2 \cdot \{\eta(t-2)\} + \cdots$, where $\{\eta(t)\}$ is
non-autocorrelated. It will be found that such a sum will not be sufficient as a
canonical form for the stationary process; in general, a singular
process $\{\psi(t)\}$, which is uncorrelated with $\{\eta(t)\}$, has to be added
in order to obtain $\{\xi(t)\}$.

After these introductory remarks, let $\{\xi(t)\}$ represent an arbitrary
non-singular stationary process with finite dispersion $\sigma$, and with
zero for mean value. In the first place, let an approximation
procedure be performed on $\xi(t)$ by means of the residuals $\eta(t; k), \ldots,$
$\eta(t-n; k)$ given by (158). Let the new residuals be denoted
$\psi(t; n; k)$ and written


(167)  $\psi(t; n; k) = \xi(t) - b(0; n; k) \cdot \eta(t; k) - \cdots - b(n; n; k) \cdot \eta(t-n; k).$

Since $\xi(t)$ is non-singular, the coefficients $b$ minimizing $D(\psi(t; n; k))$
will be uniquely determined.

The processes $\{\psi(t; n; k)\}$ defined by (167) are, like $\{\eta(t; k)\}$, of
type (61). The processes $\{\psi(t; n; k)\}$ will next be subjected to a
repeated passage to the limit, first in respect of $k$, and then in
respect of $n$.

Letting $k$ tend to infinity, and paying regard to the relation

$\lim_{k \to \infty} r(\eta(t-p; k), \eta(t-q; k)) = 0,$

which according to theorem 6 is valid for $p \neq q$, we get (cf. (163))

(168)  $\lim_{k \to \infty} b(p; n; k) = r(\xi(t), \eta(t-p)) / \kappa = b_p,$

independently of $n$. Thus, keeping in mind that the variable
$\eta(t, t-1, \ldots, t-n; k)$ tends to $\eta(t, t-1, \ldots, t-n)$ as $k \to \infty$, it follows
that for all $n$ and $t$

(169)  $\lim_{k \to \infty} \psi(t; n; k) = \xi(t) - \eta(t) - b_1 \cdot \eta(t-1) - \cdots - b_n \cdot \eta(t-n).$

Let the limit variables thus obtained be denoted by $\psi(t; n)$. Now,
holding $n$ fixed, the variables $\psi(t; n)$ will obviously constitute a
stationary process $\{\psi(t; n)\}$ which is the limit of the sequence
$\{\psi(t; n; 1)\}, \{\psi(t; n; 2)\}, \ldots$

Keeping in mind that the variables $\eta(t)$ are mutually uncorrelated,
a short calculation shows that

(170)  $D^2(\psi(t; n)) = [1 - (1 + b_1^2 + \cdots + b_n^2) \cdot \kappa^2] \cdot \sigma^2.$

Concluding that the series $\sum b_i^2$ is convergent, let us write

(171)  $K^2 = 1 + b_1^2 + b_2^2 + \cdots$

and further

(172)  $\zeta(t; n) = \eta(t) + b_1 \cdot \eta(t-1) + \cdots + b_n \cdot \eta(t-n).$

Thus prepared, let $n$ tend to infinity in (169). Since $\sum b_i^2$
converges, we have

$D^2(\zeta(t; n+p) - \zeta(t; n)) = (b_{n+1}^2 + \cdots + b_{n+p}^2) \cdot D^2(\eta) \to 0$

uniformly in $p$ as $n \to \infty$. It follows that the sum
$\eta(t) + b_1 \cdot \eta(t-1) + b_2 \cdot \eta(t-2) + \cdots$ is convergent. Denoting the
sum variable by $\zeta(t)$, we have

(173)  $\zeta(t) = \lim_{n \to \infty} \zeta(t; n) = \eta(t) + b_1 \cdot \eta(t-1) + b_2 \cdot \eta(t-2) + \cdots$

Further, the variables $\zeta(t)$ and $\psi(t) = \lim \psi(t; n)$ constitute two
stationary processes $\{\zeta(t)\} = \lim_{n \to \infty} \{\zeta(t; n)\}$ and
$\{\psi(t)\} = \lim_{n \to \infty} \{\psi(t; n)\}$ respectively.

Observing that (173) yields

(174)  $D^2(\zeta(t)) = (1 + b_1^2 + b_2^2 + \cdots) \cdot D^2(\eta) = K^2 \cdot \kappa^2 \cdot \sigma^2,$

two cases may be distinguished:

(A) $\kappa \cdot K = 1$. Then $D(\psi(t)) = 0$, and

(175)  $\{\xi(t)\} = \{\zeta(t)\};$

(B) $\kappa \cdot K < 1$. Then $D(\psi(t)) > 0$, and

(176)  $\{\xi(t)\} = \{\zeta(t)\} + \{\psi(t)\}.$

Advancing that $\{\psi(t)\}$ is singular, it is seen that (176) covers
both (175) and the cases II and III (see p. 81). Moreover, giving
$\psi(t)$ the same mean as $\xi(t)$, the representation (176) evidently holds
also in case $\{\xi(t)\}$ has a non-vanishing mean.

Formula (176) is the desired canonical form for a stationary
process with finite dispersion. As already pointed out, the variable
$\eta(t)$ corresponds directly with the case of a variable $\xi$ in a finite
number of dimensions. Further, according to (164) and (173), our
$\{\zeta(t)\}$ presents a certain similarity to the general process $\{y(t)\}$ of
linear regression as introduced in section 15 $\gamma$. However, $\{\zeta(t)\}$ is
still more general, for the variables $\eta(t)$ constituting $\zeta(t)$ are
non-correlated, while those forming $y(t)$ moreover are independent.

Some characteristic properties of the variables $\zeta(t)$ and $\psi(t)$
appearing in the canonical formula (176) will now be proved, and
the main results then comprehended in a theorem.

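Observing that

$\left( \sum_{i=0}^{\infty} b_i \cdot b_{i+p} \right)^2 \leq \left( \sum_{i=0}^{\infty} b_{i+p}^2 \right) \cdot \left( \sum_{i=0}^{\infty} b_i^2 \right) = K^2 \cdot \sum_{i=p}^{\infty} b_i^2 \to 0$ as $p \to \infty$,

we conclude that $\sum_{i=0}^{\infty} b_i \cdot b_{i+p}$ is convergent for all $p$, and that

(177)  $r_p(\zeta) = E[\zeta(t) \cdot \zeta(t-p)] / (K^2 \cdot \kappa^2 \cdot \sigma^2) = \lim_{n \to \infty} (b_p + b_{p+1} \cdot b_1 + \cdots + b_{p+n} \cdot b_n) / K^2 \to 0$ as $p \to \infty$.

Paying regard to (163), we get further

(178)  $r_p(\zeta) = (b_p + b_{p+1} \cdot b_1 + b_{p+2} \cdot b_2 + \cdots) / K^2 = \dfrac{1}{\kappa \cdot K} \cdot \lim_{n \to \infty} r(\xi(t), \; \eta(t-p) + b_1 \cdot \eta(t-p-1) + \cdots + b_n \cdot \eta(t-p-n)) = r(\xi(t), \zeta(t-p)) / (\kappa \cdot K).$

Considering in the second place $\{\psi(t)\}$, we obtain from (170)

(179)  $D^2(\psi(t)) = (1 - \kappa^2 \cdot K^2) \cdot \sigma^2.$

Next, in case $D(\psi) > 0$ we get from (169)

$r(\psi(t; n), \eta(t+p)) = r(\xi(t) - \eta(t) - b_1 \cdot \eta(t-1) - \cdots - b_n \cdot \eta(t-n), \; \eta(t+p)).$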

Keeping in mind that $\eta(t)$ is uncorrelated with any $\xi(t-p)$ and
with any $\eta(t \pm p)$ for $p \neq 0$, and paying regard to (163), a short
calculation shows that $r(\psi(t; n), \eta(t+p)) = 0$ for $p \geq -n$. It
follows that for any $p \gtreqless 0$

$r(\psi(t), \eta(t+p)) = \lim_{n \to \infty} r(\psi(t; n), \eta(t+p)) = 0.$

Hence the fundamental relation

(180)  $r(\psi(t), \zeta(t+p)) = \lim_{n \to \infty} r(\psi(t), \zeta(t+p; n)) = 0, \qquad p \gtreqless 0,$

which shows that the processes $\{\psi(t)\}$ and $\{\zeta(t)\}$ are non-correlated.
Thus we have (cf. (59) and (119))

(181)  $D^2(\xi) \cdot r_p(\xi) = D^2(\psi) \cdot r_p(\psi) + D^2(\zeta) \cdot r_p(\zeta).$

For the preparation of the remaining proof of the singularity of
$\{\psi(t)\}$, let it be observed that the non-correlation between $\eta(t)$ and
$\eta(t+p)$ for $p \neq 0$ implies that for any real number $a_1$

$$\begin{aligned} D^2(\zeta(t; n) + a_1 \cdot \zeta(t-1; n)) &= D^2(\eta(t) + (a_1 + b_1) \cdot \eta(t-1) + (a_1 \cdot b_1 + b_2) \cdot \eta(t-2) \\ &\qquad + (a_1 \cdot b_2 + b_3) \cdot \eta(t-3) + \cdots + a_1 \cdot b_n \cdot \eta(t-n-1)) \\ &= D^2(\eta(t) + B_1 \cdot \eta(t-1) + \cdots + B_{n+1} \cdot \eta(t-n-1)) \\ &= (1 + B_1^2 + \cdots + B_{n+1}^2) \cdot \kappa^2 \cdot \sigma^2 \geq \kappa^2 \cdot \sigma^2, \end{aligned}$$

the auxiliary constants $B_i$ introduced being real. By the same
argument we conclude that for any real $a_1, \ldots, a_p$

(182)  $D^2(\zeta(t) + a_1 \cdot \zeta(t-1) + \cdots + a_p \cdot \zeta(t-p)) = \lim_{j \to \infty} D^2(\zeta(t; n_j) + a_1 \cdot \zeta(t-1; n_j) + \cdots + a_p \cdot \zeta(t-p; n_j)) \geq \kappa^2 \cdot \sigma^2.$

Considering now the variable $\xi(t) + a_1 \cdot \xi(t-1) + \cdots + a_p \cdot \xi(t-p)$,
and paying regard to (176) and (180), a short calculation shows that

(183)  $D^2(\xi(t) + a_1 \cdot \xi(t-1) + \cdots + a_p \cdot \xi(t-p)) = D^2(\psi(t) + a_1 \cdot \psi(t-1) + \cdots + a_p \cdot \psi(t-p)) + D^2(\zeta(t) + a_1 \cdot \zeta(t-1) + \cdots + a_p \cdot \zeta(t-p)).$

However, an $\varepsilon > 0$ being arbitrarily given, we know from (156) that
there exists a number $p(\varepsilon)$ and a real sequence, say $a_1^*, a_2^*, \ldots, a_p^*$,
such that the left member of (183) is less than $\kappa^2 \cdot \sigma^2 + \varepsilon$. On the
other hand, (182) shows that the second variance in the right
member of (183) is not below $\kappa^2 \cdot \sigma^2$. Hence it follows that

$D^2(\psi(t) + a_1^* \cdot \psi(t-1) + \cdots + a_p^* \cdot \psi(t-p)) \leq \varepsilon.$

Since $\varepsilon$ is arbitrary, this relation implies that $\{\psi(t)\}$ is singular of
a finite or infinite rank.

Summing up, we have the following theorem in which one of
the variables $\{\psi(t)\}$ and $\{\zeta(t)\}$ may be vanishing.

Theorem 7. Denoting by $\{\xi(t)\}$ an arbitrary discrete stationary
process with finite dispersion, there exists a three-dimensional stationary
process $\{\psi(t), \zeta(t), \eta(t)\}$ with the following properties:

(A) $\{\xi(t)\} = \{\psi(t)\} + \{\zeta(t)\}$.

(B) $\{\psi(t)\}$ and $\{\zeta(t)\}$ are non-correlated.

(C) $\{\psi(t)\}$ is singular.

(D) $\{\eta(t)\}$ is non-autocorrelated.

(E) $\{\zeta(t)\} = \{\eta(t)\} + b_1 \cdot \{\eta(t-1)\} + b_2 \cdot \{\eta(t-2)\} + \cdots$

where $b_n$ represent real numbers such that $\sum b_n^2$ is convergent.
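
The correlogram of a process built according to theorem 7 follows from (177) and (181) by simple arithmetic. The Python sketch below is an illustration under assumptions of our own: the sequence $b_1, b_2, \ldots$ is finite, $D^2(\eta) = 1$, and the singular component is taken to be a harmonic $\psi(t) = A \cos \lambda t + B \sin \lambda t$ with uncorrelated $A$, $B$ of equal variance, for which $r_p(\psi) = \cos \lambda p$. The function names are hypothetical.

```python
import math


def ma_autocorrelation(b, p):
    """r_p(zeta) as in (177) for zeta(t) = eta(t) + b[0] * eta(t-1) + ... (finite list b)."""
    weights = [1.0] + list(b)
    k2 = sum(w * w for w in weights)  # K^2 of formula (171)
    p = abs(p)
    return sum(weights[i] * weights[i + p] for i in range(len(weights) - p)) / k2


def combined_autocorrelation(var_psi, r_psi, var_zeta, r_zeta, p):
    """r_p(xi) from (181) for non-correlated components psi and zeta."""
    return (var_psi * r_psi(p) + var_zeta * r_zeta(p)) / (var_psi + var_zeta)


lam = math.pi / 2.0  # illustrative frequency of the singular component
b = [0.5]            # illustrative moving-average weights


def r_psi(p):
    return math.cos(lam * p)


def r_zeta(p):
    return ma_autocorrelation(b, p)


var_zeta = 1.0 + sum(w * w for w in b)  # K^2 * D^2(eta), with D^2(eta) = 1
var_psi = 0.75
r_xi = [combined_autocorrelation(var_psi, r_psi, var_zeta, r_zeta, p) for p in range(4)]
```

Here $r_p(\zeta)$ vanishes for $p > 1$, so every correlation of $\{\xi(t)\}$ beyond the first lag is due to the singular component alone.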

Illustrations. In order to illustrate the autoregression analysis, let us consider
a normal stationary process $\{\xi(t)\}$ as defined by the characteristic function (111).
Assuming for the sake of formal simplicity that $m = 0$, and that $\sigma = 1$, we shall
first investigate a sum variable $\zeta[t]$ of type (62).

Writing $\zeta(t, t-1, \ldots, t-n) = [\zeta(t), \zeta(t-1), \ldots, \zeta(t-n)]$, we have by
definition

$$\begin{aligned} \zeta(t) &= \xi(t) + a_1 \xi(t-1) + \cdots + a_h \xi(t-h) \\ \zeta(t-1) &= \xi(t-1) + a_1 \xi(t-2) + \cdots + a_h \xi(t-h-1) \\ &\;\;\vdots \\ \zeta(t-n) &= \xi(t-n) + a_1 \xi(t-n-1) + \cdots + a_h \xi(t-n-h). \end{aligned}$$

According to the introductory remarks in section 14, the characteristic function of
$\zeta(t, t-1, \ldots, t-n)$, say $f_n(z_t, z_{t-1}, \ldots, z_{t-n}) = e^{-\frac{1}{2} Q_n}$, will for all $n \geq h$ be
obtained from the characteristic function $f_{n+h}(x_t, x_{t-1}, \ldots, x_{t-n-h})$ by the
substitution

(184)
$$\begin{aligned} x_t &= z_t \\ x_{t-1} &= a_1 z_t + z_{t-1} \\ &\;\;\vdots \\ x_{t-h} &= a_h z_t + a_{h-1} z_{t-1} + \cdots + z_{t-h} \\ x_{t-h-1} &= a_h z_{t-1} + \cdots + a_1 z_{t-h} + z_{t-h-1} \\ &\;\;\vdots \\ x_{t-n} &= a_h z_{t-n+h} + \cdots + a_1 z_{t-n+1} + z_{t-n} \\ x_{t-n-1} &= a_h z_{t-n+h-1} + \cdots + a_2 z_{t-n+1} + a_1 z_{t-n} \\ &\;\;\vdots \\ x_{t-n-h+1} &= a_h z_{t-n+1} + a_{h-1} z_{t-n} \\ x_{t-n-h} &= a_h z_{t-n}. \end{aligned}$$

We conclude that the distribution of $\zeta(t, t-1, \ldots, t-n)$ is normal, and that we
obtain $Q_n$ from $Q_{n+h}$ by the substitution (184).

According to the substitution theory of quadratic forms, the matrix defining $Q_n$
is obtained as follows (see e. g. G. Kowalewski (1909), § 94). Writing $A_{n+h}(\xi)$ for
the matrix of $Q_{n+h}$, and $B_{h,n}$ for the matrix of the substitution (184), we have

$$A_{n+h}(\xi) = \begin{pmatrix} 1 & r_1 & r_2 & \cdots & r_{n+h} \\ r_1 & 1 & r_1 & \cdots & r_{n+h-1} \\ r_2 & r_1 & 1 & \cdots & r_{n+h-2} \\ \vdots & \vdots & \vdots & & \vdots \\ r_{n+h} & r_{n+h-1} & r_{n+h-2} & \cdots & 1 \end{pmatrix},$$

$$B_{h,n} = \begin{pmatrix} 1 & 0 & 0 & \cdots & 0 & 0 \\ a_1 & 1 & 0 & \cdots & 0 & 0 \\ \vdots & \vdots & \vdots & & \vdots & \vdots \\ a_h & a_{h-1} & a_{h-2} & \cdots & 0 & 0 \\ 0 & a_h & a_{h-1} & \cdots & 0 & 0 \\ \vdots & \vdots & \vdots & & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & a_1 & 1 \\ \vdots & \vdots & \vdots & & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & a_h & a_{h-1} \\ 0 & 0 & 0 & \cdots & 0 & a_h \end{pmatrix}.$$

In the first place we must form a product matrix consisting of $n + h + 1$ rows and
$n + 1$ columns, and take for the element in the $i$th row and $j$th column the inner
product of the $i$th row in $A_{n+h}(\xi)$ by the $j$th column in $B_{h,n}$. Multiplying then
$B_{h,n}$ and the product matrix by columns, we arrive at the matrix required.
Denoting this by $A_n(\zeta)$, and by $M'$ the transposed of a matrix $M$, we have

$A_n(\zeta) = B'_{h,n} \cdot A_{n+h}(\xi) \cdot B_{h,n}.$

Forming in the same way the infinite matrix

(185)  $A(\zeta) = B' \cdot A(\xi) \cdot B,$

where

(186)
$$A(\xi) = \begin{pmatrix} 1 & r_1 & r_2 & \cdots \\ r_1 & 1 & r_1 & \cdots \\ r_2 & r_1 & 1 & \cdots \\ \vdots & \vdots & \vdots & \end{pmatrix}, \qquad B = \begin{pmatrix} 1 & 0 & 0 & 0 & \cdots & & & \\ a_1 & 1 & 0 & 0 & \cdots & & & \\ a_2 & a_1 & 1 & 0 & \cdots & & & \\ \vdots & \vdots & \vdots & \vdots & & & & \\ a_h & a_{h-1} & a_{h-2} & \cdots & a_1 & 1 & 0 & \cdots \\ 0 & a_h & a_{h-1} & \cdots & a_2 & a_1 & 1 & \cdots \\ \vdots & \vdots & \vdots & & \vdots & \vdots & \vdots & \end{pmatrix},$$

it is readily verified that $A_n(\zeta)$ equals the principal minor of order $n + 1$ in $A(\zeta)$. We
conclude that the distribution of $\zeta[t]$ is normal, and that $A(\zeta)$ is the matrix of the
infinite quadratic form $Q(z_t, z_{t-1}, \ldots)$ appearing in the exponent of the
characteristic function of $\zeta(t, t-1, \ldots)$.

Formally, the above procedure applies even in the case of an infinite sequence
$(a_1, a_2, \ldots)$. It is seen that if the double series appearing in the matrix $A(\zeta)$ are
absolutely convergent, the variable $\zeta[t] = \sum_{i=0}^{\infty} a_i \, \xi[t-i]$ will be well-defined, and
constitute a normal process. The characteristic function of the variable $\zeta(t, t-1, \ldots)$
will be given by $e^{-\frac{1}{2} Q}$.

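Next we shall consider a few particular instances.

Let $(1, b_1, b_2, \ldots)$ represent a real sequence such that $K^2 = 1 + \sum b_i^2$ is finite,
and let $\{\eta(t)\}$ be a normal and purely random process with vanishing mean, and
dispersion equalling unity. Considering the variable
$\zeta[t] = \eta[t] + b_1 \eta[t-1] + b_2 \eta[t-2] + \cdots$, it is readily verified that the
above substitution procedure gives $A(\zeta) = B' \cdot A(\eta) \cdot B$, where

$$A(\eta) = \begin{pmatrix} 1 & 0 & 0 & \cdots \\ 0 & 1 & 0 & \cdots \\ 0 & 0 & 1 & \cdots \\ \vdots & \vdots & \vdots & \end{pmatrix}, \quad B = \begin{pmatrix} 1 & 0 & 0 & 0 & \cdots \\ b_1 & 1 & 0 & 0 & \cdots \\ b_2 & b_1 & 1 & 0 & \cdots \\ b_3 & b_2 & b_1 & 1 & \cdots \\ \vdots & \vdots & \vdots & \vdots & \end{pmatrix}, \quad A(\zeta) = \begin{pmatrix} K^2 & K^2 r_1 & K^2 r_2 & \cdots \\ K^2 r_1 & K^2 & K^2 r_1 & \cdots \\ K^2 r_2 & K^2 r_1 & K^2 & \cdots \\ \vdots & \vdots & \vdots & \end{pmatrix},$$

having in the latter matrix written $r_i$ for $(b_i + b_{i+1} \cdot b_1 + b_{i+2} \cdot b_2 + \cdots) / K^2$.
These results are seen to be in full agreement with formula (177), and with the
correlation properties of a normal stationary process (see p. 62).

Considering in the second place a process $\{\xi(t)\}$ which is singular of rank $h$,
there exists, by definition, a sequence $a_1, \ldots, a_h$ such that the relation (77) is
satisfied. Accordingly, the infinite matrix $A(\zeta) = B' \cdot A(\xi) \cdot B$ formed by means
of the matrices (186), will consist entirely of zeros. For instance, letting
$\xi(t) + \xi(t-1) = 0$ be the relation of singularity, we have $h = a_1 = 1$ (cf. (107)). A
short calculation will show that

$$A(\xi) = \begin{pmatrix} 1 & -1 & 1 & -1 & \cdots \\ -1 & 1 & -1 & 1 & \cdots \\ 1 & -1 & 1 & -1 & \cdots \\ -1 & 1 & -1 & 1 & \cdots \\ \vdots & \vdots & \vdots & \vdots & \end{pmatrix}, \quad B = \begin{pmatrix} 1 & 0 & 0 & 0 & \cdots \\ 1 & 1 & 0 & 0 & \cdots \\ 0 & 1 & 1 & 0 & \cdots \\ 0 & 0 & 1 & 1 & \cdots \\ \vdots & \vdots & \vdots & \vdots & \end{pmatrix}, \quad A(\zeta) = \begin{pmatrix} 0 & 0 & 0 & 0 & \cdots \\ 0 & 0 & 0 & 0 & \cdots \\ 0 & 0 & 0 & 0 & \cdots \\ 0 & 0 & 0 & 0 & \cdots \\ \vdots & \vdots & \vdots & \vdots & \end{pmatrix}.$$

If $\{\xi(t)\}$ is singular of infinite rank, there exists, for every integer $n$ and every
$\varepsilon > 0$, a number $h(\varepsilon, n)$ and a sequence $a_i(\varepsilon, n)$ such that the variable
$\xi[t] + a_1 \xi[t-1] + \cdots + a_h \xi[t-h]$ will give rise to a matrix $A_n(\zeta)$, whose elements
are all less than $\varepsilon$ in modulus.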
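
The two instances just treated can be verified by carrying out the substitution for finite matrices. The following Python sketch is a minimal illustration in plain list arithmetic; it assumes a finite coefficient sequence, so that the product $B'_{h,n} \cdot A_{n+h}(\xi) \cdot B_{h,n}$ is exact, and the function names are hypothetical.

```python
def substitution_matrix(a, n):
    """Matrix B_{h,n} of the substitution (184): n + h + 1 rows, n + 1 columns."""
    weights = [1.0] + list(a)
    h = len(a)
    return [[weights[i - j] if 0 <= i - j <= h else 0.0 for j in range(n + 1)]
            for i in range(n + h + 1)]


def transformed_matrix(r, a, n):
    """A_n(zeta) = B' * A_{n+h}(xi) * B, where r(k) is the correlogram of xi."""
    b = substitution_matrix(a, n)
    size = n + len(a) + 1
    big = [[r(abs(i - j)) for j in range(size)] for i in range(size)]
    # Product matrix: rows of A_{n+h}(xi) by columns of B.
    ab = [[sum(big[i][k] * b[k][j] for k in range(size)) for j in range(n + 1)]
          for i in range(size)]
    # Multiply B and the product matrix by columns.
    return [[sum(b[k][i] * ab[k][j] for k in range(size)) for j in range(n + 1)]
            for i in range(n + 1)]


def r_random(k):
    """Correlogram of a purely random process."""
    return 1.0 if k == 0 else 0.0


def r_alternating(k):
    """Correlogram of a process obeying xi(t) + xi(t - 1) = 0."""
    return 1.0 if k % 2 == 0 else -1.0


moving_average = transformed_matrix(r_random, [0.5], 3)
singular = transformed_matrix(r_alternating, [1.0], 3)
```

In the first case the diagonal elements equal $K^2 = 1.25$ and the adjacent ones $K^2 \cdot r_1 = 0.5$. In the second case every element vanishes, in agreement with the relation of singularity.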

Proceeding to the operation of summing independent processes, let $\{\zeta(t)\}$ and
$\{\psi(t)\}$ stand for two independent normal processes. Denoting the sum process by
$\{\xi(t)\}$, and indicating all symbols referring to the three processes by $\xi$, $\zeta$ and $\psi$
respectively, and paying regard to the evident fact that the characteristic function
of $\xi(t, t-1, \ldots)$ is the mathematical product of the characteristic functions of the
variables $\zeta(t, t-1, \ldots)$ and $\psi(t, t-1, \ldots)$, we obtain
$D^2(\xi) \cdot Q(\xi) = D^2(\zeta) \cdot Q(\zeta) + D^2(\psi) \cdot Q(\psi)$, i.e.

$$D^2(\xi) \cdot \sum_{p=0}^{\infty} \sum_{q=0}^{\infty} r_{|p-q|}(\xi) \cdot x_{t-p} x_{t-q} = D^2(\zeta) \cdot \sum_{p=0}^{\infty} \sum_{q=0}^{\infty} r_{|p-q|}(\zeta) \cdot x_{t-p} x_{t-q} + D^2(\psi) \cdot \sum_{p=0}^{\infty} \sum_{q=0}^{\infty} r_{|p-q|}(\psi) \cdot x_{t-p} x_{t-q}.$$

In full agreement with (59) we obtain the relation (181), and observe that mutually
uncorrelated normal processes are always independent.

The simple types of normal process mentioned above are sufficient to illustrate
theorem 7. Starting from a purely random normal process $\{\eta(t)\}$ with suitable
dispersion, forming a sum process of type
$\{\zeta(t)\} = \{\eta(t)\} + b_1 \cdot \{\eta(t-1)\} + b_2 \cdot \{\eta(t-2)\} + \cdots$, and adding an
independent normal process $\{\psi(t)\}$ ruled by an appropriate singularity, we shall
arrive at an arbitrarily prescribed normal process.



CHAPTER III.


On the theory of some special stationary processes.

21. On the concept of stochastical difference equation.

The relation

(187)  $\{\xi(t)\} + a_1 \cdot \{\xi(t-1)\} + \cdots + a_h \cdot \{\xi(t-h)\} = \{\eta(t)\}$

arrived at in section 15 $\delta$ presents a formal analogy with an
ordinary, or functional, difference equation

(188)  $x(t) + a_1 \cdot x(t-1) + \cdots + a_h \cdot x(t-h) = y(t).$

We have found under certain conditions concerning the coefficients
$a_i$, and in case the process $\{\eta(t)\}$ is purely random, that there
exists a stationary process $\{\xi(t)\}$ which satisfies (187) and is of type

(189)  $\{\xi(t)\} = \{\eta(t)\} + b_1 \cdot \{\eta(t-1)\} + b_2 \cdot \{\eta(t-2)\} + \cdots$

On the other hand, since under general conditions a solution of
(188) will be of the form

(190)  $x(t) = y(t) + b_1 \cdot y(t-1) + b_2 \cdot y(t-2) + \cdots,$

a clear formal analogy may be seen between (187) and (188) also
in respect of the solutions.

Expressing the situation in words, a solution of type (190) of a
functional difference equation is a moving average performed on
the function $y(t)$ in the right member. Correspondingly, any sample
series $(\ldots, \xi_{t-1}, \xi_t, \xi_{t+1}, \ldots)$ connected with a process $\{\xi(t)\}$ of type (189) may
be looked upon as a moving average of a purely random series
$(\ldots, \eta_{t-1}, \eta_t, \eta_{t+1}, \ldots)$.
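
The equivalence between the difference equation (188) and the moving average (190) may be illustrated numerically. The Python sketch below rests on assumptions of our own: the right member vanishes before the first recorded time point, and the weights are generated by the recursion $b_0 = 1$, $b_k + a_1 b_{k-1} + \cdots + a_h b_{k-h} = 0$, which is taken here to be what the cited formulae (96) and (97) amount to. The function names are hypothetical.

```python
def moving_average_weights(a, count):
    """Weights b_0 = 1, b_1, ..., b_count from b_k + a_1 b_{k-1} + ... + a_h b_{k-h} = 0."""
    b = [1.0]
    for k in range(1, count + 1):
        b.append(-sum(a[j - 1] * b[k - j] for j in range(1, min(k, len(a)) + 1)))
    return b


def solve_recursively(a, y):
    """x(t) = y(t) - a_1 x(t-1) - ... - a_h x(t-h), with x = 0 before the start."""
    x = []
    for t, value in enumerate(y):
        x.append(value - sum(a[j - 1] * x[t - j]
                             for j in range(1, min(t, len(a)) + 1)))
    return x


def solve_by_moving_average(a, y):
    """The same solution written as the moving average (190) of the right member."""
    b = moving_average_weights(a, len(y) - 1)
    return [sum(b[i] * y[t - i] for i in range(t + 1)) for t in range(len(y))]


a = [-0.9, 0.5]                       # illustrative coefficients a_1, a_2
y = [1.0, 0.0, 2.0, -1.0, 0.5, 0.0]   # illustrative right member
x_recursive = solve_recursively(a, y)
x_average = solve_by_moving_average(a, y)
weights = moving_average_weights(a, 2)
```

Both routes give the same series, with $b_1 = -a_1 = 0.9$ and $b_2 = a_1^2 - a_2 = 0.31$ in this example.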

Because of this parallelism, I propose to call (187) a stochastical
difference relation between the processes $\{\xi(t)\}$ and $\{\eta(t)\}$. If $\{\eta(t)\}$
is known, and $\{\xi(t)\}$ unknown, (187) will be termed a stochastical
difference equation. An interpretation in the language of the theory
of oscillatory mechanisms will reveal some interesting connexions
between functional and stochastical difference equations, and exemplify
the wide applicability of the new concept.

An oscillatory mechanism presents certain intrinsic features
relevant to the structure of the movement considered. Studying
the movement in integral time points, these features are summed
up in the relation

(191)  $x(t) + a_1 \cdot x(t-1) + \cdots + a_h \cdot x(t-h) = 0.$

Interpreting (191) as an ordinary difference equation, the solutions
(see section 6) describe how the phenomenon would develop out
from any initial values, say $x(t-1) = x_{t-1}, \ldots, x(t-h) = x_{t-h}$, if
there were no external influence present.

In the ordinary difference equations, the external factors are
taken into account by means of the function $y(t)$. Thus, instead
of the value $-a_1 \cdot x_{t-1} - a_2 \cdot x_{t-2} - \cdots - a_h \cdot x_{t-h}$ to be expected
for $x(t)$ when the earlier values are known, the variable in question
takes on the value $y(t) - a_1 \cdot x_{t-1} - \cdots - a_h \cdot x_{t-h}$. In this approach
the external influence is dealt with as functional, i. e. uniquely
determined at any future time point.

The stochastical approach differs in the allowance for the external
influence upon the mechanism. Here the external factors are not
dealt with as functionally determined; they are only assumed to
be ruled by certain probability laws. These laws constitute the
stochastical process $\{\eta(t)\}$; in general, the probability laws are
subjected only to the conditions (53) and (54), which express that
the laws must not contradict themselves (cf. p. 3). The simple
case investigated in section 15 $\delta$ corresponds to a purely random
effect of the external factors, but nothing prevents us from
approximating the external influence by a non-purely-random, or even
a non-stationary process $\{\eta(t)\}$.

Having fixed a process $\{\eta(t)\}$, any sample series $(\ldots, \eta_{t-1}, \eta_t, \eta_{t+1}, \ldots)$
will describe an actual realization of the external development.
Since the movement of the mechanism is known when the external
factors are determined, the sample series $(\ldots, \eta_{t-1}, \eta_t, \eta_{t+1}, \ldots)$
considered will correspond to a certain sample series, say
$(\ldots, \xi_{t-1}, \xi_t, \xi_{t+1}, \ldots)$, of the process $\{\xi(t)\}$. However, as we possess only
probability knowledge of the actual path $(\ldots, \eta_{t-1}, \eta_t, \eta_{t+1}, \ldots)$ of
the external factors, we can reach only probability laws about the
behaviour of the oscillatory mechanism. These probability laws
concerning $(\ldots, \xi_{t-1}, \xi_t, \xi_{t+1}, \ldots)$ constitute the process $\{\xi(t)\}$, and
form a solution of the stochastical difference equation.

By means of the probability laws found for the mechanism, it
will be possible to give information as to the average behaviour of
the phenomenon considered, i. e. as to the expectations referring to
$\{\xi(t)\}$. It should be observed that we cannot say in advance that
the conclusions as to the average behaviour will be identical to
those drawn from the functional difference equation (191).
Nevertheless, it has often been argued, more or less explicitly,
that any intrinsic tendency of the mechanism to produce periodic
oscillations will, on the average, give rise to a corresponding
oscillation in the phenomenon when influenced by random shocks
(see e. g. Sir G. Walker (1931), p. 522, and R. Frisch (1933), p. 202).
However, the following analysis will show that there are important
instances when such inference based on analogy is incorrect,
qualitatively as well as quantitatively. For instance, let
$C \cdot \rho^t \cdot \cos(\lambda_1 \cdot t + \varphi)$ represent a solution of (191), i. e. a damped oscillation
characteristic of the mechanism, and consider the simple case of
purely random external shocks. Then, even if there is no other
intrinsic tendency to oscillation present, a periodogram analysis for
the search of the frequency $\lambda_1$ will be more or less misleading.
As will be shown in section 25, there is in general a systematic
deviation between $\lambda_1$ and the abscissa for which the expectation of
the periodogram ordinate presents a maximum. It may even happen
that there is no maximum at all in the neighbourhood of $\lambda_1$.
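
The deviation between the intrinsic frequency and the location of the periodogram maximum can be made concrete for a mechanism of the second order. The Python sketch below is an illustration only and anticipates nothing of section 25. It assumes a damped oscillation with damping factor `rho` and frequency `lam`, i.e. coefficients $a_1 = -2 \rho \cos \lambda$ and $a_2 = \rho^2$ in (191), purely random shocks, and it uses the spectral density $1 / |1 + a_1 e^{-i\omega} + a_2 e^{-2i\omega}|^2$ as a stand-in for the expected periodogram ordinate. These choices, and the function names, are our own.

```python
import math


def spectral_shape(a1, a2, omega):
    """Spectral density, up to a constant factor, of xi(t) + a1 xi(t-1) + a2 xi(t-2) = eta(t)."""
    re = 1.0 + a1 * math.cos(omega) + a2 * math.cos(2.0 * omega)
    im = a1 * math.sin(omega) + a2 * math.sin(2.0 * omega)
    return 1.0 / (re * re + im * im)


def peak_frequency(rho, lam, grid=20000):
    """Frequency in [0, pi] at which the spectral shape is largest (grid search)."""
    a1, a2 = -2.0 * rho * math.cos(lam), rho * rho
    best = max(range(grid + 1),
               key=lambda k: spectral_shape(a1, a2, math.pi * k / grid))
    return math.pi * best / grid


# Moderately damped mechanism: the maximum lies below the intrinsic frequency.
lam_a = math.pi / 3.0
peak_a = peak_frequency(0.7, lam_a)

# Strongly damped mechanism: no interior maximum, the largest ordinate is at zero.
lam_b = math.pi / 6.0
peak_b = peak_frequency(0.3, lam_b)
```

For the first mechanism the maximum falls at the frequency whose cosine is $\cos \lambda \cdot (1 + \rho^2) / (2 \rho)$, slightly below $\lambda$. For the second this quantity exceeds unity, and the ordinate simply decreases from its value at zero frequency.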

As soon as external influence cannot be considered free from
random elements, the stochastical difference equation should be
preferred to the functional equation (cf. the quotations from G. U.
Yule (1927) in section 10). It is also obvious that the former
embraces the latter as a special case, for any function $y(t)$ may be
interpreted as a singular random process. Thus, the stochastical
difference equations seem to merit particular interest.

When omitting all dispensable conditions as to $\{\eta(t)\}$ and the
$a_i$, the solutions of the stochastical difference equations become
of a very general type, and embrace fundamentally different classes
of random process. Having already seen in section 15 $\delta$ that the
solutions cover the stationary processes of linear autoregression,
let us in the second place consider the special equation

(192)  $\{\xi(t)\} - \{\xi(t-1)\} = \{\eta(t)\},$

taking as before a purely random process for $\{\eta(t)\}$. The solutions
of this equation are seen to form a type case of the discrete
homogeneous process (see e. g. H. Cramér (1937), Ch. VIII). In sharp
contrast to the stationary processes, the oscillations here tend to
increase in amplitude as time goes on. We may express this fact
by saying that the homogeneous process is evolutive (cf. p. 1). In the
particular case (192), the process $\{\xi(t)\}$ cannot be assumed to have
been in movement during an infinite past. Accordingly, and in
contradistinction to the stationary case, the analysis of this equation
generally has to be restricted to an interval of type $(t_0, \infty)$.

As already pointed out, there are many problems calling for 
investigation in connexion with the general stochastical difference 
equation. The coming section is reserved for some groundwork 
concerning such equations. In accordance with the program of the 
present study, non-stationary solutions will be dealt with only very 
briefly. 


22. Some fundamentals concerning stochastical difference equations.
According to the definition given in the previous section,

(193)  $\{\xi(t)\} + a_1 \cdot \{\xi(t-1)\} + \cdots + a_h \cdot \{\xi(t-h)\} = \{\eta(t)\}$

forms a stochastical difference equation in $\{\xi(t)\}$ if the coefficients
$a_i$ are real, and if $\{\eta(t)\}$ is a discrete random process. If $a_h \neq 0$,
the equation will be termed of order $h$.

Let first an equation of order $h$ with vanishing right member be
considered,

(194)  $\{\xi(t)\} + a_1 \cdot \{\xi(t-1)\} + \cdots + a_h \cdot \{\xi(t-h)\} = 0.$

If there are any solutions to this equation, these will be singular
in the sense indicated in section 14, and have (77) with $m = 0$ for
relation of singularity. Being particularly interested in stationary
solutions with finite dispersion, it follows from the analysis in
section 14 that there exists a non-vanishing stationary process which
satisfies (194) if, and only if, the characteristic equation (34) has
at least one root on the circumference of the unit circle.

It is seen that if $\{\xi(t)\}$ and $\{\psi(t)\}$ are independent processes
such that $\{\xi(t)\}$ is a solution of (193), while $\{\psi(t)\}$ satisfies (194),
then $\{\xi(t)\} + \{\psi(t)\}$ will satisfy (193). This property of the
stochastical difference equation forms another analogy to the functional
case.

Secondly, we shall touch upon the case when the variable
appearing in (193) is of the type
$\{\eta(t_0; t)\} = [\ldots, 0, 0, \eta(t_0), \eta(t_0+1), \eta(t_0+2), \ldots]$
considered in connexion with theorem 1. It is evident
that this equation has one, and only one, solution of type
$\{\xi(t_0; t)\} = [\ldots, 0, 0, \xi(t_0), \xi(t_0+1), \ldots]$, and that this solution is given by the
following system,

$$\begin{aligned} \xi(t_0+1) &= \eta(t_0+1) - a_1 \cdot \xi(t_0) = \eta(t_0+1) - a_1 \cdot \eta(t_0) \\ \xi(t_0+2) &= \eta(t_0+2) - a_1 \cdot \xi(t_0+1) - a_2 \cdot \xi(t_0) = \eta(t_0+2) - a_1 \cdot \eta(t_0+1) + (a_1^2 - a_2) \cdot \eta(t_0), \\ &\;\;\vdots \end{aligned}$$

The general variable $\xi(t_0 + t)$ is evidently of type

(195)  $\xi(t_0+t) = \eta(t_0+t) + b_1 \cdot \eta(t_0+t-1) + b_2 \cdot \eta(t_0+t-2) + \cdots + b_t \cdot \eta(t_0).$

The coefficients $b_i$ introduced are seen to be identical to those
used in section 15 $\delta$. Thus, $b_1, b_2, \ldots, b_h$ will be obtained from the
system (97), and the following ones from the difference equation
(96). The following elementary theorem concerning the coefficients
$b_i$ will prove useful.

Theorem 8. The series $b_t$ defined by (96) and (97) does not satisfy
any difference equation of type (32) of lower order than $h$.

Writing $b_t$ on the form (33), let us examine the values, say $b(t)$,
of this analytical function taken on for $t \leq 0$. Since every linear
difference relation which is satisfied by this function for $t > 0$ must
hold also for $t \leq 0$ and vice versa, it is sufficient to verify theorem
8 for $t \leq 0$. According to the difference equation (96), we get

(196)
$$\begin{aligned} b_1 + a_1 \cdot b(0) + a_2 \cdot b(-1) + \cdots + a_{h-1} \cdot b(-h+2) + a_h \cdot b(-h+1) &= 0 \\ b_2 + a_1 \cdot b_1 + a_2 \cdot b(0) + \cdots + a_{h-1} \cdot b(-h+3) + a_h \cdot b(-h+2) &= 0 \\ \vdots \qquad & \\ b_{h-1} + a_1 \cdot b_{h-2} + a_2 \cdot b_{h-3} + \cdots + a_{h-1} \cdot b(0) + a_h \cdot b(-1) &= 0 \\ b_h + a_1 \cdot b_{h-1} + a_2 \cdot b_{h-2} + \cdots + a_{h-1} \cdot b_1 + a_h \cdot b(0) &= 0. \end{aligned}$$

Identifying the left members of the last equations in the two systems
(97) and (196), we obtain $a_h \cdot b(0) = a_h$. Since $a_h \neq 0$, this relation
gives

(197)  $b(0) = 1.$

Inserting $b(0) = 1$ in the system (196), the $(h-1)$st equations of the
two systems considered give $a_h \cdot b(-1) = 0$, or $b(-1) = 0$. In the same
way we obtain successively

(198)  $0 = b(-1) = b(-2) = \cdots = b(-h+1).$

Now, if $b_t$ also satisfied a difference equation of lower order than
$h$, it would follow from (198) that $b(0) = 0$, which contradicts (197).

We conclude from the theorem just proved that none of the
individual components will be identically vanishing when $b_t$ is
written on the form (33). Thus, according to section 6, the two
series $\sum_{i=1}^{\infty} |b_i|$ and $\sum_{i=1}^{\infty} b_i^2$ will be convergent if, and only if, all roots
of the equation (34) are lying within the boundary of the unit
circle. This corollary is important to the following.

Recurring to the solution $\{\zeta(t_0; t)\}$, let the case be considered
when the variables $\eta(t)$ are independent, and have identical distribution
functions. Then we obtain from (195)

$D^2(\zeta(t_0+t)) = D^2(\eta) \cdot [1 + b_1^2 + b_2^2 + \cdots + b_{t-1}^2]$.

Since $D(\eta) > 0$, the process $\{\zeta(t_0; t)\}$ thus will be evolutive if the
series $\sum_{i=1}^{\infty} b_i^2$ is divergent. According to the above, divergence takes
place if one or more of the roots of the characteristic equation are
lying on the boundary of or outside the unit circle.
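
The contrast between a convergent and a divergent series $\sum b_i^2$ can be checked numerically. The following is a minimal sketch, assuming that the systems (97) and (96) amount to $b_0 = 1$ and $b_k = -(a_1 b_{k-1} + \cdots + a_h b_{k-h})$, with terms of negative index absent. The function names and the numerical coefficients are illustrative and do not occur in the text.

```python
def b_coefficients(a, n):
    """Return [b_0, ..., b_n] for coefficients a = [a_1, ..., a_h]."""
    b = [1.0]
    for k in range(1, n + 1):
        # (97) for k <= h, (96) for k > h; terms with negative index are absent
        b.append(-sum(a[i - 1] * b[k - i] for i in range(1, min(k, len(a)) + 1)))
    return b


def variance_ratio(a, t):
    """D^2(zeta(t0 + t)) / D^2(eta) = 1 + b_1^2 + ... + b_{t-1}^2."""
    return sum(x * x for x in b_coefficients(a, t - 1))


# Roots 0.5 and 0.4, inside the unit circle: a_1 = -(0.5 + 0.4), a_2 = 0.5 * 0.4.
damped = [-0.9, 0.2]
# A single root 1.0, on the unit circle: a_1 = -1.
undamped = [-1.0]

for t in (10, 100, 1000):
    print(t, round(variance_ratio(damped, t), 6), variance_ratio(undamped, t))
```

In the damped case the ratio settles at a finite value, whereas with a root on the unit circle it grows in proportion to $t$, which is the evolutive behaviour described above.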

The process of linear autoregression is, by construction, a solution
of a stochastical difference equation such that (A) the variable
in the right member is purely random, and (B) all roots of the
characteristic equation are of a modulus less than unity. We leave
aside the question of whether there are other equations with
stationary solutions (it seems likely that certain equations involving
a singular and stationary $\{\eta(t)\}$, and where all the roots of
the characteristic equation are of modulus unity, are satisfied by
suitable stationary and singular processes $\{\zeta(t)\}$). This short
introduction will be terminated by the proof of the following theorem,
which shows that the condition (A) can be generalized inasmuch
as it is sufficient that the right member is stationary. In the language
of the theory of oscillatory mechanisms, the theorem states that a
mechanism whose intrinsic movements are damped will give rise to
a stationary oscillation when influenced by external shocks of a
stationary kind.

Theorem 9. Let (193) be a stochastical difference equation such
that all roots of its characteristic equation are of a modulus less than
unity, let $\{\eta(t)\}$ be stationary and have a finite dispersion, and let the
sequence $b_1, b_2, \ldots$ be given by (96) and (97). Then

$\lim_{n \to \infty} [\{\eta(t)\} + b_1 \cdot \{\eta(t-1)\} + b_2 \cdot \{\eta(t-2)\} + \cdots + b_n \cdot \{\eta(t-n)\}]$

will exist, and form a stationary solution of the equation.

Observing that for any $m > 0$

$D^2[b_n \cdot \eta(t-n) + b_{n+1} \cdot \eta(t-n-1) + \cdots + b_{n+m} \cdot \eta(t-n-m)] \le (|b_n| + |b_{n+1}| + \cdots + |b_{n+m}|)^2 \cdot D^2(\eta(t)) \to 0$ as $n \to \infty$,

we conclude in the first place that

$\lim_{n \to \infty} [\eta(t) + b_1 \cdot \eta(t-1) + \cdots + b_n \cdot \eta(t-n)]$

will exist. Denoting the limit variable by $\zeta(t)$, an application of
theorem 1 shows that the variables $\zeta(t)$ arrived at will constitute a
stationary process $\{\zeta(t)\}$. That this process satisfies (193), may be
proved in the same way as the identity (100).


23. On the stationary processes with finite dispersion and with no singular component.

When writing a stationary process with finite dispersion on the 
canonical form (176) it may happen that the singular component is 
vanishing (cf. section 15 y). In this section, we shall derive some 
general formulae concerning such processes. In the coming sections
of the present chapter, we shall use these formulae, and the previous 
analysis of stochastical difference equations, for a detailed study of 
the processes of moving averages and of linear autoregression. 
Let $\{\zeta(t)\}$ and $\{\eta(t)\}$ represent two stationary processes such that

(199) $\zeta(t) = \eta(t) + b_1 \cdot \eta(t-1) + b_2 \cdot \eta(t-2) + \cdots$

(200) $\eta(t) = \zeta(t) + a_1 \cdot \zeta(t-1) + a_2 \cdot \zeta(t-2) + \cdots$

where

(A) $\{\eta(t)\}$ is non-autocorrelated,

(B) $D(\eta(t)) > 0$ is finite,

(C) $E[\eta(t)] = 0$,

(D) the sum $\sum b_k^2$ is convergent.

Thanks to the convergence of $\sum b_k^2$, the formulae involving only
the coefficients $b_k$ will all have a real meaning. On the other hand,
the coefficients $a_k$ have been introduced in a purely formal way,
and all questions concerning their existence will be left open for
later treatment in connexion with the analysis of special cases. As
a matter of fact, the expressions which involve the coefficients
$a_k$ will be used mainly as a formal comprehension of the special
cases of moving averages and of linear autoregression.

Replacing $t$ in (199) by $t-1, t-2, \ldots$, and inserting in (200),
we obtain the following relations between the coefficients $a_k$ and $b_k$,

(201) $a_k + b_1 \cdot a_{k-1} + \cdots + b_{k-1} \cdot a_1 + b_k = 0$, $k = 1, 2, \ldots$

If the set $(a_k)$ is given, the set $(b_k)$ thus will be uniquely determined,
and vice versa. We obtain for the first few coefficients



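(202)
$a_1 = -b_1$, $b_1 = -a_1$,
$a_2 = -b_2 + b_1^2$, $b_2 = -a_2 + a_1^2$,
$a_3 = -b_3 + 2 b_1 b_2 - b_1^3$, $b_3 = -a_3 + 2 a_1 a_2 - a_1^3$,
$a_4 = -b_4 + 2 b_1 b_3 + b_2^2 - 3 b_1^2 b_2 + b_1^4$, $b_4 = -a_4 + 2 a_1 a_3 + a_2^2 - 3 a_1^2 a_2 + a_1^4$.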
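
The successive solution of (201) is easily mechanized. The sketch below is illustrative only: the function name `invert_sequence` and the numerical values are assumptions made here. Since (201) is symmetric in the two sequences, one routine serves in both directions, and the result may be compared with the explicit expressions (202).

```python
def invert_sequence(c, n):
    """Solve relation (201) for the other sequence.

    c = [c_1, c_2, ...] is treated as zero beyond its end; the result is
    [d_1, ..., d_n] with d_k + c_1 d_{k-1} + ... + c_{k-1} d_1 + c_k = 0.
    """
    c = list(c) + [0.0] * max(0, n - len(c))
    d = []
    for k in range(1, n + 1):
        cross = sum(c[i - 1] * d[k - i - 1] for i in range(1, k))
        d.append(-(c[k - 1] + cross))
    return d


b = [0.5, -0.3, 0.2, 0.1]      # arbitrary illustrative values b_1, ..., b_4
a = invert_sequence(b, 4)      # a_1, ..., a_4 from (201)

b1, b2, b3, b4 = b
a_from_202 = [
    -b1,
    -b2 + b1**2,
    -b3 + 2 * b1 * b2 - b1**3,
    -b4 + 2 * b1 * b3 + b2**2 - 3 * b1**2 * b2 + b1**4,
]
print(a)
print(a_from_202)
print(invert_sequence(a, 4))   # recovers b_1, ..., b_4 up to rounding
```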


Keeping the notations used in section 19, we write

(203) = + 

and remark that the vanishing of the singular component implies 
that 1/x. Thus we have 

(204) 

while the general formulae (163) and (177) reduce to

$K \cdot r(\zeta(t+n); \eta(t)) = b_n$,

(205) $r_n = r_n^{(\zeta)} = (b_n + b_1 \cdot b_{n+1} + b_2 \cdot b_{n+2} + \cdots)/K^2$,

writing shortly $r_n$ for $r_n^{(\zeta)}$.

Next, we shall derive a fundamental set of relations between the
sequences $(a)$, $(b)$ and $(r)$. Multiplying the two identities

(206) $\zeta(t+s) = \eta(t+s) + b_1 \cdot \eta(t+s-1) + b_2 \cdot \eta(t+s-2) + \cdots$

(207) $\zeta(t) + a_1 \cdot \zeta(t-1) + a_2 \cdot \zeta(t-2) + \cdots = \eta(t)$,

and forming the expectations of the resulting two members, we
obtain in the case of a negative $s$ in (206)

(208) $r_k + a_1 \cdot r_{k-1} + \cdots + a_{k-1} \cdot r_1 + a_k + a_{k+1} \cdot r_1 + a_{k+2} \cdot r_2 + \cdots = 0$,

for all $k > 0$. In the same way, taking $s \ge 0$, we get

(209) $(1 + a_1 \cdot r_1 + a_2 \cdot r_2 + \cdots) \cdot D^2(\zeta) = D^2(\eta)$,

and

(210) $r_k + a_1 \cdot r_{k+1} + a_2 \cdot r_{k+2} + \cdots = b_k/K^2$, $k > 0$.

Using the relations (149)-(152), we shall next derive a set of
forecast formulae. For this purpose, we shall consider the variable
$\zeta(t+k)$ as conditioned by the development of the process up to the
time point $t$ inclusive. Writing

(211) $F_C[\zeta(t+k)] = F_t[\zeta(t+k)]$,

where $(C) = (\zeta(t-k) = \zeta_{t-k},\ \eta(t-k) = \eta_{t-k};\ k = 0, 1, 2, \ldots)$, we
have first the formal relations


(212)
$\zeta_{t-k} = \eta_{t-k} + b_1 \cdot \eta_{t-k-1} + b_2 \cdot \eta_{t-k-2} + \cdots$; $k = 0, 1, 2, \ldots$
$\eta_{t-k} = \zeta_{t-k} + a_1 \cdot \zeta_{t-k-1} + a_2 \cdot \zeta_{t-k-2} + \cdots$; $k = 0, 1, 2, \ldots$

Since the variables $\eta(t)$ are uncorrelated, (150) gives the linear
forecast

(213) $F_t[\zeta(t+k)] = b_k \cdot \eta_t + b_{k+1} \cdot \eta_{t-1} + b_{k+2} \cdot \eta_{t-2} + \cdots$; $k = 1, 2, \ldots$

As verified below, we have further
(214) $F_t[\zeta(t+k)] = -a_1 \cdot F_t[\zeta(t+k-1)] - a_2 \cdot F_t[\zeta(t+k-2)] - \cdots - a_{k-1} \cdot F_t[\zeta(t+1)] - a_k \cdot \zeta_t - a_{k+1} \cdot \zeta_{t-1} - a_{k+2} \cdot \zeta_{t-2} - \cdots$; $k = 1, 2, \ldots$

This relation, which makes possible a successive calculation of the
linear forecasts $F_t$, reduces to (213) if every $F_t[\zeta(t+k-i)]$ in the
right member is written on the form (213), and every $\zeta_{t-i}$ is expressed in the values
$\eta_t, \eta_{t-1}, \ldots$ by means of (212). In fact, for $i \ge 0$ the coefficient of $\eta_{t-i}$
then becomes

$-a_1 \cdot b_{k+i-1} - a_2 \cdot b_{k+i-2} - \cdots - a_{k+i-1} \cdot b_1 - a_{k+i}$,

and according to (201) this expression equals $b_{k+i}$.
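
The agreement between the two forecast formulae can also be checked numerically. The sketch below is illustrative only and rests on assumptions made here: the process is of finite order ($a_n = 0$ beyond $a_h$), all values before the first observation are zero, and the function names and numbers are hypothetical. Forecasts are computed once from the $\eta$-values by (213) and once successively from the $\zeta$-values by (214).

```python
def b_coefficients(a, n):
    """[b_0, ..., b_n] with b_0 = 1 and b_k = -(a_1 b_{k-1} + ... + a_h b_{k-h})."""
    b = [1.0]
    for k in range(1, n + 1):
        b.append(-sum(a[i - 1] * b[k - i] for i in range(1, min(k, len(a)) + 1)))
    return b


def build_zeta(a, eta):
    """zeta_s + a_1 zeta_{s-1} + ... + a_h zeta_{s-h} = eta_s, zero before s = 0."""
    zeta = []
    for s, e in enumerate(eta):
        past = sum(a[i - 1] * zeta[s - i] for i in range(1, len(a) + 1) if s - i >= 0)
        zeta.append(e - past)
    return zeta


def forecast_213(a, eta, k):
    """(213): b_k eta_t + b_{k+1} eta_{t-1} + ..., t being the last index of eta."""
    t = len(eta) - 1
    b = b_coefficients(a, k + t)
    return sum(b[k + j] * eta[t - j] for j in range(t + 1))


def forecasts_214(a, zeta, kmax):
    """(214): successive forecasts for k = 1, ..., kmax from the zeta-values."""
    t = len(zeta) - 1
    f = []                                            # f[k - 1] is F_t[zeta(t + k)]
    for k in range(1, kmax + 1):
        value = 0.0
        for i in range(1, len(a) + 1):
            if i < k:
                value -= a[i - 1] * f[k - i - 1]      # earlier forecast
            elif t + k - i >= 0:
                value -= a[i - 1] * zeta[t + k - i]   # observed value
        f.append(value)
    return f


a = [-0.9, 0.2]                               # illustrative coefficients
eta = [1.0, -0.5, 0.3, 0.8, -1.2, 0.4]        # illustrative shock values
zeta = build_zeta(a, eta)
via_213 = [forecast_213(a, eta, k) for k in range(1, 6)]
via_214 = forecasts_214(a, zeta, 5)
print(via_213)
print(via_214)
```

The two printed lists agree up to rounding, as the identity (201) requires.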

Alternatively, we may express the forecasts $F_t$ in terms of the
values $\zeta_t, \zeta_{t-1}, \ldots$ Writing

(215) $F_t[\zeta(t+k)] = f_{k,0} \cdot \zeta_t + f_{k,1} \cdot \zeta_{t-1} + f_{k,2} \cdot \zeta_{t-2} + \cdots$,

we record in the first place

$f_{1,i} = -a_{i+1}$, $i = 0, 1, 2, \ldots$

Further, inserting (215) in the left member of (214), and writing
also the forecasts in the right member on the same form, we
obtain

(216)
$f_{k,0} + a_1 \cdot f_{k-1,0} + a_2 \cdot f_{k-2,0} + \cdots + a_{k-1} \cdot f_{1,0} + a_k = 0$,
$f_{k,1} + a_1 \cdot f_{k-1,1} + a_2 \cdot f_{k-2,1} + \cdots + a_{k-1} \cdot f_{1,1} + a_{k+1} = 0$,
$\cdots$

Thus, after having calculated the coefficients $f_{k-i,j}$ appearing in the
forecasts $F_t[\zeta(t+k-i)]$, the relations (216) yield the coefficients
$f_{k,j}$ necessary for computing $F_t[\zeta(t+k)]$ in terms of the $\zeta_{t-j}$.

The relations (213)-(215) will be referred to as the forecasting
formulae. The sample series sections $(\zeta_t, \zeta_{t-1}, \zeta_{t-2}, \ldots)$ or (and) $(\eta_t, \eta_{t-1}, \ldots)$
being given, these formulae furnish the best linear
forecast as to the future development of the series, viz. in the
following sense.

A particular sample series section, say $(C) = (\zeta_t, \zeta_{t-1}, \ldots)$,
being given, there exists a set of constants $f_{k,0}(C), f_{k,1}(C), f_{k,2}(C), \ldots$
minimizing the expectation

(217) $E_C[\zeta(t+k) - f_{k,0}(C) \cdot \zeta_t - f_{k,1}(C) \cdot \zeta_{t-1} - f_{k,2}(C) \cdot \zeta_{t-2} - \cdots]^2$.
In general, the constants $f_{k,i}(C)$ will depend on $(C)$, and differ from
the $f_{k,i}$. Now, considering on the other hand the expectation

$E_C[\zeta(t+k) - f_{k,0} \cdot \zeta_t - f_{k,1} \cdot \zeta_{t-1} - f_{k,2} \cdot \zeta_{t-2} - \cdots]^2$,

our $f_{k,i}$ possess the property that the (weighted) average of
this expression based on all sets $(C)$ becomes a minimum. This
minimum is

(218) $(1 + b_1^2 + b_2^2 + \cdots + b_{k-1}^2) \cdot D^2(\eta)$,

a formula which shows clearly the scope of the forecast method
under consideration. A forecast $F_t[\zeta(t+k)]$ over $k$ time units is
decreasing in efficiency as $k$ increases. As $k \to \infty$, the expression
(218) tends to $K^2 \cdot D^2(\eta) = D^2(\zeta(t))$. In other words, for large $k$-values
the forecast $F_t[\zeta(t+k)]$ is approximately of the same efficiency as
the trivial forecast $E[\zeta(t+k)] = E[\zeta(t)] = 0$.

If $\{\eta(t)\}$ is purely random, it follows from (152) that we have

$F_t[\zeta(t+k)] = E_C[\zeta(t+k)]$.

In this case, the coefficients $f_{k,i}(C)$ appearing in (217) are independent
of $(C)$,

$f_{k,i}(C) = f_{k,i}$.

It is seen that (213)-(215) then give for every $(C)$ a forecast which
is the best one according to the principle of least squares.


24. On the process of linear autoregression. General developments. 

As already mentioned, we shall in the present section investigate
in some detail the process of linear autoregression as dealt with in
the sections 15 d and 22. Denoting the process by $\{\zeta(t)\}$, the
defining relation will be written

(219) $\{\zeta(t)\} + a_1 \{\zeta(t-1)\} + \cdots + a_h \{\zeta(t-h)\} = \{\eta(t)\}$.

It will be observed that the process $\{\eta(t)\}$ need not be purely
random; the following developments are valid under the broader
assumption that the stationary process $\{\eta(t)\}$ is non-autocorrelated.
As before, we shall assume that $E[\eta(t)] = E[\zeta(t)] = 0$.
The formal developments given in the previous section are all
valid in the present case. In fact, in the sequence $a_n$ we have
$a_n = 0$ for $n > h$, so the serial developments are only apparently
infinite. In order to arrive at more precise knowledge, we shall
next consider these developments in some detail.

As to the $b_k$-series, we already know from the analysis in section
22 of the general stochastical difference equation that $b_k$ in the
present case is of type (33), that the oscillations are damped,
and that the series does not satisfy any linear difference equation
of lower order than $h$.

Formula (209) gives

(220) $(1 + a_1 \cdot r_1 + \cdots + a_h \cdot r_h) \cdot D^2(\zeta) = D^2(\eta)$.


As to the autocorrelation coefficients, we obtain from (208) and 
(210) the following three groups of relations. 


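(221)
$r_k + a_1 \cdot r_{k-1} + a_2 \cdot r_{k-2} + \cdots + a_{h-1} \cdot r_{k-h+1} + a_h \cdot r_{k-h} = 0$
$r_{h+1} + a_1 \cdot r_h + a_2 \cdot r_{h-1} + \cdots + a_{h-2} \cdot r_3 + a_{h-1} \cdot r_2 + a_h \cdot r_1 = 0$
$r_h + a_1 \cdot r_{h-1} + a_2 \cdot r_{h-2} + \cdots + a_{h-2} \cdot r_2 + a_{h-1} \cdot r_1 + a_h = 0$

(222)
$r_{h-1} + a_1 \cdot r_{h-2} + a_2 \cdot r_{h-3} + \cdots + a_{h-2} \cdot r_1 + a_{h-1} + a_h \cdot r_1 = 0$
$\cdots$
$r_1 + a_1 + a_2 \cdot r_1 + \cdots + a_{h-2} \cdot r_{h-3} + a_{h-1} \cdot r_{h-2} + a_h \cdot r_{h-1} = 0$

(223)
$1 + a_1 \cdot r_1 + a_2 \cdot r_2 + \cdots + a_{h-2} \cdot r_{h-2} + a_{h-1} \cdot r_{h-1} + a_h \cdot r_h = 1/K^2$
$r_1 + a_1 \cdot r_2 + a_2 \cdot r_3 + \cdots + a_{h-1} \cdot r_h + a_h \cdot r_{h+1} = b_1/K^2$
$r_k + a_1 \cdot r_{k+1} + a_2 \cdot r_{k+2} + \cdots + a_{h-1} \cdot r_{h+k-1} + a_h \cdot r_{h+k} = b_k/K^2$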


The first group is given in the paper of Sir G. Walker (1931)
already referred to. We quote from the same paper the observations
that (32) constitutes a difference equation satisfied by the $r_k$-series
for $k \ge h$, that this relation is the same as that satisfied by
the $b_k$-series, and that both series thus present damped oscillations
of type (33).¹ Sir G. Walker mentions further that the observation
that the $r_k$-series satisfies the difference equation (32) has already
been given by G. U. Yule (1927) for the special case $h = 1$.

¹ K. Stumpff ((1936), p. 35) gives the relation $b_k = r_k$, which is correct only in
case [...]. Stumpff's deduction of the expectation relations (127) is based on
this error.

The second group (222) contains $h - 1$ relations, and involves the
autocorrelation coefficients $r_1, r_2, \ldots, r_{h-1}$. Now, we obtain the
coefficients $r_1, \ldots, r_{h-1}$ directly in terms of the coefficients $a_i$ by
solving the system (222). In fact, assuming that the equations are
connected by a linear relation, say with coefficients $A_1, \ldots, A_{h-1}$,
we get $0 = A_1 + a_h A_{h-1} = A_1 a_h + A_{h-1} = A_1 (1 - a_h^2)$; since $|a_h| < 1$ it
follows $A_1 = 0$, and similarly $A_2 = \cdots = A_{h-1} = 0$. The following
$r_k$-values are obtained recurrently by means of the difference relations
(221). In this connexion we give the following parallel to
theorem 8.

Theorem 10. Let $\{\zeta(t)\}$ be a process of linear autoregression of
order $h$. Then the autocorrelation coefficients of $\{\zeta(t)\}$ satisfy no
difference equation of type (32) of lower order than $h$.

For the proof, let us write $r_k$ on the form (33), and consider the
values, say $r(k)$, taken on by this function for $k = 0, -1, -2, \ldots$
According to (177), we have $m = 0$. Next, comparing the relation

$r_h + a_1 \cdot r_{h-1} + \cdots + a_{h-1} \cdot r_1 + a_h \cdot r(0) = 0$

with the last equation in the system (221), we find $a_h \cdot r(0) = a_h$.
Since $a_h \neq 0$, we conclude

(224) $r(0) = 1$.

Comparing further the first equation in the system (222) with

$r_{h-1} + a_1 \cdot r_{h-2} + \cdots + a_{h-2} \cdot r_1 + a_{h-1} \cdot r(0) + a_h \cdot r(-1) = 0$,

and paying regard to (224), we obtain $r(-1) = r_1$. This procedure
may be continued $h - 1$ steps, which gives

(225) $r(-1) = r_1$, $r(-2) = r_2$, $\ldots$, $r(-h+1) = r_{h-1}$.

Proceeding one step further, we arrive at the first equation in the
system (223), and obtain $r(-h) + 1/(a_h K^2) = r_h$, from which we
conclude

(226) $r(-h) \neq r_h$.
Now, considering the function $d_k$ defined for $k = 0, \pm 1, \pm 2, \ldots$ by

(227) $d_k = r_k - r(-k)$,

the relations (224)-(226) yield

(228) $d_{-h} \neq 0$; $d_{-h+1} = d_{-h+2} = \cdots = d_{-1} = d_0 = d_1 = \cdots = d_{h-1} = 0$; $d_h \neq 0$.

On the other hand, it follows from (227) that $d_k$ satisfies a difference
equation of type (32) of an order twice that of the equation satisfied
by $r_k$. However, by the argument used in theorem 8 it follows
from (228) that $d_k$ cannot possibly satisfy an equation (32) of lower
order than $2h$. Thus $r_k$ will satisfy no equation of type (32)
of lower order than $h$, a reflection which completes the proof.

Among the general developments in section 23 there remains for
consideration the forecasting formulae. Since $a_k = 0$ for $k > h$,
formula (214) shows that the forecasts $F_t[\zeta(t+1)], F_t[\zeta(t+2)], \ldots, F_t[\zeta(t+k)], \ldots$,
concerning the time points $t+1, t+2, \ldots, t+k, \ldots$,
forecasts based on the development up to the time point $t$,
will satisfy the equation (32) in respect of $k$. Thus, the forecasts,
too, will form a damped oscillation. Explaining in the terminology
of the theory of oscillatory mechanisms, the forecast curve describes
how the mechanism would develop out from the situation arrived
at in the time point $t$ if there were no external influence present
in the future time points $t+1, t+2, \ldots$ As by hypothesis the
intrinsic oscillations of the mechanism are damped, the forecast
curve will also be damped, in full agreement with the above. Thus,
$F_t[\zeta(t+k)] \to 0 = E[\zeta(t+k)]$ as $k \to \infty$, in agreement with the concluding
remark of section 23.

Considering, finally, the relations (216), it will be observed that
the coefficients $f_{k,i}$ for all $i$ satisfy the difference equation (32) in
respect of $k$.

Next, we shall illustrate some points of the general analysis in
Chapter II by means of the process of linear autoregression.
Keeping in mind that in the present case $\sum_{k=0}^{\infty} |r_k|$ is convergent,
we may apply the corollary to theorem 5 (see p. 69). We conclude
that the generating function $W(x)$ for all $x$ in $(0, \pi)$ has a bounded
derivative given by (121). In order to transform this expression
upon a finite form we write
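$G(x) = \sum_{k=0}^{\infty} r_k \cdot e^{ikx} = e^{isx} \cdot \sum_{t=-s}^{\infty} r_{t+s} \cdot e^{itx}$.

Considering the identity

$G(x) \cdot [e^{-ihx} + a_1 \cdot e^{-i(h-1)x} + \cdots + a_{h-1} \cdot e^{-ix} + a_h] = \sum_{t=-h}^{\infty} r_{t+h} \cdot e^{itx} + a_1 \cdot \sum_{t=-h+1}^{\infty} r_{t+h-1} \cdot e^{itx} + \cdots + a_{h-1} \cdot \sum_{t=-1}^{\infty} r_{t+1} \cdot e^{itx} + a_h \cdot \sum_{t=0}^{\infty} r_t \cdot e^{itx}$

and paying regard to the relations (221), the right member reduces to

$e^{-ihx} + (a_1 + r_1) \cdot e^{-i(h-1)x} + (a_2 + a_1 r_1 + r_2) \cdot e^{-i(h-2)x} + \cdots + (a_{h-1} + a_{h-2} r_1 + \cdots + r_{h-1}) \cdot e^{-ix}$.

Thus we obtain $G(x)$ on the finite form

$G(x) = \frac{1 + (a_1 + r_1) \cdot e^{ix} + \cdots + (a_{h-1} + a_{h-2} r_1 + \cdots + a_1 r_{h-2} + r_{h-1}) \cdot e^{i(h-1)x}}{1 + a_1 \cdot e^{ix} + a_2 \cdot e^{2ix} + \cdots + a_h \cdot e^{ihx}}$,

while $W'(x)$ is given by

(229) $W'(x) = G(x) + G(-x) - 1 = 2 \cdot R[G(x)] - 1$,

where $R[G(x)]$ stands for the real part of $G(x)$.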

Denoting by $x_i$ the roots of (34), the roots of $1 + a_1 x + \cdots + a_h x^h = 0$
are seen to equal $1/x_i$. Since these roots are lying
outside the periphery of the unit circle, the denominator of $G(x)$
is evidently non-vanishing, which is in full agreement with the
earlier observation that $W'(x)$ is bounded. Now, paying regard to
the relations (127) and (129), and summing up the main results, we
obtain the following theorem.

Theorem 11. The generating function $W(x)$ of the autocorrelation
coefficients in a process $\{\zeta(t)\}$ of linear autoregression is absolutely
continuous, and has a bounded derivative $W'(x)$ given by (229). The
expectation $E[C^2(n, \lambda)]$ of an arbitrary ordinate in the Schuster
periodogram, as defined by (126), is given by

$E[C^2(n, \lambda)] = \frac{4}{n} \cdot D^2(\zeta) \cdot W'(\lambda) + o(1/n)$.
It should be noticed that the expectation is of the same order
of magnitude as in the case of a purely random series (cf. (37)). 
When extending the analysed series, the periodogram ordinates 
thus tend to vanish. 

In view of the applications of the theory of linear autoregression,
it is of fundamental importance to investigate the possibilities
of drawing conclusions from a sample series $(\zeta_t, \zeta_{t-1}, \zeta_{t-2}, \ldots)$ upon
the characteristic equation (191), or, interpreting in the language
of the theory of oscillatory mechanisms, to investigate whether
the past development of the mechanism as influenced by random
external factors can give any information about its intrinsic
oscillatory tendencies. In particular, is it possible to find out the
periods of the intrinsic damped oscillations?

The previous analysis shows that the classical periodogram ana- 
lysis is an inadequate method for the search of intrinsic oscillations 
— the longer the series analysed, the poorer the results. This con- 
clusion holds both for the Schuster and the Whittaker periodo- 
grams since they are of equal efficiency (see section 8). In the 
illustrations given in the next section, these periodogram questions 
are dealt with in more detail. 

As has been emphasized by G. U. Yule (1927) and Sir G. Walker
(1931), an adequate tool for the search of the intrinsic
properties of an oscillatory mechanism is yielded by the serial
coefficients of the time series investigated. In fact, a serial coefficient
approximates the corresponding autocorrelation coefficient $r_k$, and
we know that the graph of $r_k$ presents exactly those damped oscillations
which are characteristic of the mechanism considered: there
is conformity in respect to both the frequencies of the individual
components of the oscillations and their damping exponential
factors. Since a periodogram analysis is concerned with only the
intrinsic frequencies, it is of particular importance that the serial
coefficients can give information even about the damping factors.

Having above derived expressions for the autocorrelation coefficients
and other characteristics connected with a process of linear
autoregression, we are, in view of the applications, confronted with
problems of an inverse type. In particular, it is seen that if the
autocorrelation coefficients of a process of autoregression are known,
the coefficients $(a)$ will be obtained from the system (221)-(222).
After having derived the coefficients $(a)$ we obtain the primary
process $\{\eta(t)\}$ in terms of the process considered by means of
formula (200). Concluding that the inverse problem mentioned involves
no difficulty in point of principle, reference is given to the next
section and Chapter IV for illustrations.
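
The inverse step from the autocorrelation coefficients to the coefficients $(a)$ amounts to solving $h$ linear equations. A minimal sketch follows, assuming the relations $r_k + a_1 r_{k-1} + \cdots + a_h r_{k-h} = 0$ for $k = 1, \ldots, h$ with $r_0 = 1$ and $r_{-j} = r_j$; the function names and the numerical values are illustrative and not taken from the text.

```python
def solve_linear(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda row: abs(aug[row][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for j in range(col, n + 1):
                aug[row][j] -= factor * aug[col][j]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def coefficients_from_autocorrelations(r):
    """r = [r_1, ..., r_h]; returns [a_1, ..., a_h]."""
    h = len(r)
    rho = [1.0] + list(r)                      # rho[j] = r_j with r_0 = 1
    matrix = [[rho[abs(k - i)] for i in range(1, h + 1)] for k in range(1, h + 1)]
    rhs = [-rho[k] for k in range(1, h + 1)]
    return solve_linear(matrix, rhs)


# Forward step for h = 2 with a_1 = -0.9, a_2 = 0.2 (illustrative values):
a1, a2 = -0.9, 0.2
r1 = -a1 / (1 + a2)        # from r_1 + a_1 + a_2 r_1 = 0
r2 = -a1 * r1 - a2         # from r_2 + a_1 r_1 + a_2 = 0
recovered = coefficients_from_autocorrelations([r1, r2])
print(recovered)           # close to [-0.9, 0.2]
```

With observed serial coefficients in place of the exact $r_k$, the same routine returns estimates only, and the questions of reliability discussed next apply.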

A question now presenting itself concerns the reliability of the 
information yielded by the serial coefficients. Here we meet at 
once the same obstacles as in all significance problems concerning 
autocorrelated time series. In the first place, we notice that when 
forming the sampling variance of a serial coefficient, we arrive at a 
complicated expression involving i. a. an extensive sum of correlation 
coefficients between different serial coefficients. The difficulties 
of the problem having already been mentioned by G. U. Yule
(1927), E. Slutsky (1934) has presented a large collection of for- 
mulae concerning the dispersion of various characteristics derived 
from sample series sections, formulae derived under the assumption 
that the variables considered are normally distributed. 

However, it should be observed that the relevant problem does 
not consist merely in calculating the variance or the distribution 
of a single autocorrelation coefficient. We have also to face the 
much deeper question concerning the reliability of the periodicities 
which present themselves in the graph of the serial coefficients. In 
view of the complications already occurring in sampling problems 
involving merely one individual serial coefficient, the possibility of 
arriving at a practicable, quantitative measure of significance in 
this connexion seems, at least for the moment, hopeless. 

There is also another essential difficulty. Correlation in time 
series, and correlation as considered in the classical applications, 
differ as to their quantitative significance (see H. Wold (1936)). As 
a matter of fact, in the former case the correlation coefficients are, 
as a rule, quantitatively conditioned by the size of the statistical 
masses to which the coefficients refer. In order to give an example, 
we advance that certain business cycle data yield nice graphs of 
serial coefficients (see Chapter IV). For instance, the serial coefficients
of the G. Myrdal index 1830-1913 of the cost of living
in Sweden show a clear damped oscillation (see e. g. fig. 14). Let it
now be imagined that another index had been computed by the
same method, covering the same period (84 years), but referring
merely to a small part of Sweden. In the second case there
would, of course, be a larger random element present in the index.
A short reflection will show that the increase of the random element
is accompanied by a systematic tendency to a diminishing of the
serial coefficients. Consider, on the other hand, a classical
application such as the correlation between cranial indices. Here
the statistical units, the skulls, are uniquely determined, unmodifiable:
in a material of say 84 skulls the correlation coefficient can be
calculated in only one way. In other words, while we cannot
possibly form a coefficient referring to a certain part of each of
the 84 skulls, there is nothing illegitimate in the modifying of the
84 statistical units in the case of the time series correlation
considered above. Advancing that the effect in question is treated
in some detail in Appendix B, we point out for the present that 
in a theoretical autocorrelation model the size of the statistical 
mass must also be taken into consideration. 
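
The diminishing of the serial coefficients under an added random element can be made concrete in a simple special case. The sketch below assumes that the extra element enters additively as a non-autocorrelated disturbance of zero mean, uncorrelated with the series itself. This additive model is an assumption made here for illustration only, and the treatment in Appendix B may proceed differently. Under it the covariances at lags $k \ge 1$ are unchanged while the variance grows, so every coefficient is reduced by the same factor.

```python
def attenuated_coefficient(r_k, var_series, var_noise):
    """Autocorrelation at a lag k >= 1 of x(t) = zeta(t) + e(t).

    Assumes e(t) is non-autocorrelated with zero mean and variance var_noise,
    and uncorrelated with zeta(t), whose variance is var_series.
    """
    return r_k * var_series / (var_series + var_noise)


# Illustrative values: r_1 = 0.6 for the undisturbed series with unit variance.
for var_noise in (0.0, 0.5, 1.0, 2.0):
    print(var_noise, attenuated_coefficient(0.6, 1.0, var_noise))
```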

Summing up, the intricacy of the problems of significance, and 
the dependence of the correlation coefficients on the statistical mass, 
constitute two fundamental difficulties in quantitative autocorrela- 
tion analysis. The situation seems to justify the opinion that the 
utmost caution is necessary when drawing quantitative conclusions 
from observational time series — a hypothesis should not be 
considered safe unless corroborated by empirical series obtained 
from different and, if possible, independent statistical masses, and 
supported by aprioristic considerations independent of the statistical 
evidence. 

The applications to observational data presented in Chapter IV 
are far from aiming at quantitative results, the purpose being 
more to exemplify the qualitative differences between the scheme 
of hidden periodicities and the schemes of linear regression. Since 
we attach no importance to the quantitative outcomes, the signi- 
ficance problems will not be entered upon in the present study. 


25. On some special cases of linear autoregression.

Keeping the assumptions of the previous section, we shall in this
section consider the special cases obtained when putting $h = 1$ and
$h = 2$ in the general definition (219) of linear autoregression. The
resulting formulae will be readily surveyable, and give rise to a
few remarks of general scope. In a few instances the model series
given in section 15 will be used for the illustrations.
Attaching the analysis to the formula (219), we take $h = 2$, and
denote the roots of the characteristic equation (191) by $p$ and $q$.
Thus we have

(230) $a_1 = -(p + q)$, $a_2 = p \cdot q$, $a_n = 0$ for $n > 2$; $|p| < 1$, $|q| < 1$;

$\zeta[t] - (p + q) \cdot \zeta[t-1] + p \cdot q \cdot \zeta[t-2] = \eta[t]$.

As the coefficients must be real, we notice that either I. $p$ and $q$
are real, or II. $p = A + iB$, $q = A - iB$, where $A$ and $B$ represent
real numbers such that

(231) $A^2 + B^2 = |p|^2 = |q|^2 < 1$.

We shall assume that $p \neq q$, gaining thereby the general solution
of the difference equation (32) to be (cf. also p. 146)

(232) $P_1 \cdot p^t + P_2 \cdot q^t$,

where $P_1$ and $P_2$ are arbitrary. In case II, this expression may
be written (cf. (33))

$C^t \cdot [Q_1 \cdot \cos \lambda_1 t + Q_2 \cdot \sin \lambda_1 t]$,

where $Q_1$ and $Q_2$ are arbitrary, and

$C = +\sqrt{A^2 + B^2}$; $\cos \lambda_1 = A/C$, $0 < \lambda_1 < \pi$.

Cases I and II will be dealt with separately.

I. $p$ and $q$ are real.

Inserting the general expression (232) for $b_1$ and $b_2$ in the system
(97), and solving for the constants $P_1$ and $P_2$, we obtain readily

(233) $b_k = \frac{p}{p-q} \cdot p^k - \frac{q}{p-q} \cdot q^k = (p^{k+1} - q^{k+1})/(p - q)$, $k \ge 0$.

Next, insertion of this expression in (203) yields

$\frac{D^2(\zeta)}{D^2(\eta)} = \frac{1 + pq}{(1 - p^2)(1 - q^2)(1 - pq)}$.

The system (222) reduces to the single equation

(234) $r_1 + a_1 + a_2 \cdot r_1 = 0$.
Solving for $r_1$, observing that $r_0 = 1$ (cf. formula (224)), equalling
these two coefficients to the general expression (232) for $r_0$ and $r_1$,
and solving the linear system thus obtained for the constants $P_1$
and $P_2$, we get

(235) $r_k = \frac{p(1 - q^2)}{(p - q)(1 + pq)} \cdot p^k + \frac{q(1 - p^2)}{(q - p)(1 + pq)} \cdot q^k$, $k \ge 0$.
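
The closed expressions for $b_k$, $r_k$ and the variance ratio in the case of two real roots can be compared with values computed directly from the difference equation. The sketch below is illustrative only; it takes $K^2 = 1 + b_1^2 + b_2^2 + \cdots$ (an assumption consistent with the limit stated after (218)), truncates the infinite sums, and uses hypothetical function names and root values.

```python
def b_closed(k, p, q):
    return (p ** (k + 1) - q ** (k + 1)) / (p - q)            # (233)


def r_closed(k, p, q):
    num = p * (1 - q * q) * p ** k - q * (1 - p * p) * q ** k
    return num / ((p - q) * (1 + p * q))                      # (235)


def variance_ratio_closed(p, q):
    return (1 + p * q) / ((1 - p * p) * (1 - q * q) * (1 - p * q))


def b_recursive(p, q, n):
    """[b_0, ..., b_n] from b_k + a_1 b_{k-1} + a_2 b_{k-2} = 0 with (230)."""
    a1, a2 = -(p + q), p * q
    b = [1.0, -a1]
    for k in range(2, n + 1):
        b.append(-a1 * b[k - 1] - a2 * b[k - 2])
    return b


def r_from_b(b, k):
    """(205), truncated: (b_k + b_1 b_{k+1} + ...) / K^2."""
    k_squared = sum(x * x for x in b)
    return sum(b[i] * b[i + k] for i in range(len(b) - k)) / k_squared


p, q = 0.8, -0.5                      # illustrative real roots inside the unit circle
b = b_recursive(p, q, 400)
print(sum(x * x for x in b), variance_ratio_closed(p, q))
for k in range(4):
    print(k, b[k], b_closed(k, p, q), r_from_b(b, k), r_closed(k, p, q))
```

Each pair of printed values agrees up to rounding and truncation error.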


It is readily verified that (205) is satisfied. Using (230) and (235) in
the formula (229), we find for the derivative of the generating
function in the interval $(0, \pi)$

(236) $W'(\lambda) = \frac{(1 - p^2)(1 - q^2)(1 - pq)}{(1 + pq)(1 + p^2 - 2p \cos \lambda)(1 + q^2 - 2q \cos \lambda)} = \frac{D^2(\eta)}{D^2(\zeta)} \cdot \frac{1}{(1 + p^2 - 2p \cos \lambda)(1 + q^2 - 2q \cos \lambda)}$,

and a short calculation will verify (121).

Theorem 11 gives for the expectation of an arbitrary ordinate
in the Schuster periodogram

$E[C^2(n, \lambda)] = \frac{4 D^2(\eta)}{n (1 + p^2 - 2p \cos \lambda)(1 + q^2 - 2q \cos \lambda)} + o(1/n)$.


The following special cases are instructive.

1) $q = 0$. In this case the relations reduce to

(237) $\zeta[t] - p \cdot \zeta[t-1] = \eta[t]$,

(238) $D^2(\zeta) = D^2(\eta)/(1 - p^2)$, $r_k = b_k = p^k$, $k \ge 0$,

(239) $W'(\lambda) = \frac{1 - p^2}{1 + p^2 - 2p \cos \lambda} = \frac{D^2(\eta)}{D^2(\zeta) \cdot (1 + p^2 - 2p \cos \lambda)}$, $E[C^2(n, \lambda)] \sim \frac{4 D^2(\eta)}{n \cdot (1 + p^2 - 2p \cos \lambda)}$.

These formulae, which cover the case $h = 1$, have been given earlier
by Sir G. Walker ((1931), p. 521).¹

¹ Sir G. Walker gives also the variance formula (220) for an arbitrary $h$ (formula K).
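
For the case $q = 0$ the closed form of $W'(\lambda)$ can be compared with the series that (229) yields when $G(x) = \sum r_k e^{ikx}$ and $r_k = p^k$, namely $1 + 2 \sum_{k \ge 1} p^k \cos k\lambda$. The following sketch is illustrative only; the function names and the value of $p$ are assumptions made here, and the series is truncated.

```python
import math


def w_prime_series(p, lam, terms=2000):
    """1 + 2 * sum_{k=1}^{terms} p^k cos(k lam), the truncated form of (229)."""
    return 1.0 + 2.0 * sum(p ** k * math.cos(k * lam) for k in range(1, terms + 1))


def w_prime_closed(p, lam):
    """(1 - p^2) / (1 + p^2 - 2 p cos(lam))."""
    return (1 - p * p) / (1 + p * p - 2 * p * math.cos(lam))


p = 0.8                                   # illustrative value
for lam in (0.0, math.pi / 4, math.pi / 2, math.pi):
    print(round(lam, 4), w_prime_series(p, lam), w_prime_closed(p, lam))
```

For positive $p$ the curve falls steadily from $\lambda = 0$ to $\lambda = \pi$, so that no interior maximum appears.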
2) $q = -p$. We obtain without difficulty

$\zeta[t] - p^2 \cdot \zeta[t-2] = \eta[t]$,

$b_{2k} = r_{2k} = p^{2k}$, $b_{2k+1} = r_{2k+1} = 0$, $k \ge 0$,

(240) $W'(\lambda) = \frac{1 - p^4}{(1 + p^2)^2 - 4 p^2 \cos^2 \lambda}$, $E[C^2(n, \lambda)] = \frac{4 D^2(\eta)}{n \cdot [(1 + p^2)^2 - 4 p^2 \cos^2 \lambda]} + o(1/n)$.

Fig. 2. Generating function derivatives obtained from formulae (236) and (239).

The graph above shows the curves $W'(\lambda)$ which belong to the
processes defined by

a) formula (103), i. e. j? == '8, g==0 (thin line), 

b) » (104), i. e. jp = *8, g = 0 (thick line), 

c) f) == '8, g = *- 13 (broken line). 

The graph contains, for comparison, the line $W'(\lambda) = 1$ which
represents the derivative of the generating function in the case of
a purely random process (cf. the remark attached to formula (131)).

II. $p$ and $q$ are conjugate complex,

(241) $p = A + iB$, $q = A - iB$.

Since the above developments are perfectly general, we have now
only to insert (241) in the formulae for $b_k$, $r_k$, etc. given under I.
After elementary transformations we get for $k \ge 0$

$\zeta[t] - 2A \cdot \zeta[t-1] + (A^2 + B^2) \cdot \zeta[t-2] = \eta[t]$,

(242) $b_k = C^k \cdot [\cos k\lambda_1 + \frac{A}{B} \cdot \sin k\lambda_1]$,

(243) $r_k = C^k \cdot [\cos k\lambda_1 + \frac{A}{B} \cdot \frac{1 - C^2}{1 + C^2} \cdot \sin k\lambda_1]$,

(244) $\frac{D^2(\zeta)}{D^2(\eta)} = \frac{1 + C^2}{(1 - C^2)(1 + C^4 - 2A^2 + 2B^2)}$.
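Writing

$J(\lambda) = 4 C^2 \cdot [\cos \lambda - \frac{A (1 + C^2)}{2 C^2}]^2 + \frac{B^2}{C^2} \cdot (1 - C^2)^2$,

we get

(245) $W'(\lambda) = \frac{(1 - C^2)(1 + C^4 - 2A^2 + 2B^2)}{(1 + C^2) \cdot J(\lambda)}$

and

$E[C^2(n, \lambda)] = \frac{4 D^2(\eta)}{n \cdot J(\lambda)} + o(1/n)$.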

Possessing now a collection of formulae sufficient for our purposes,
a partial check is obtained by observing that when putting $A = 0$
in the above expressions, we get the same result as when replacing
$p$ by $i \cdot p$ in the formulae I, 2.

Proceeding to a first application of the above developments, we
observe that the formulae under II cover the case of an oscillatory
mechanism whose intrinsic oscillations consist of a single damped
harmonic with frequency $\lambda_1$ lying in the interval $0 < \lambda_1 < \pi$, and a
damping factor $C^k$. Having earlier found that a periodogram
analysis is ineffective in the case of linear autoregression, we are
now in a position to prove the statement advanced in section 21
concerning the dangers of the periodogram method (see p. 95).
Speaking generally, the question is whether the expectation of
the periodogram ordinate $C^2(n, \lambda)$ presents a maximum when $\lambda = \lambda_i$,
i. e. when the abscissa of the periodogram equals a frequency
characteristic to the intrinsic oscillations of the mechanism. Were
the answer in the affirmative, we should (at least in principle)
be able to use a periodogram construction for the discovery of the
intrinsic frequencies $\lambda_i$. In fact, considering a sample series
$(\zeta_1, \zeta_2, \ldots)$, making a sequence of periodograms on the basis of
sections of type $(\zeta_{t+1}, \ldots, \zeta_{t+n})$, and forming
for every abscissa $\lambda$ the average of the corresponding sequence of
periodogram ordinates, the resulting curve would present maxima in
the points $\lambda = \lambda_i$ sought for. However, in order to prove
that this way is barred (in point of principle, thus even if the
statistical material were extensive enough for the construction of the
required set of periodograms) we need only consider the
case of a single intrinsic oscillation covered by the
formulae (II) above. The expectation $E[C^2(n, \lambda)]$ being asymptotically
proportional to the derivative $W'(\lambda)$, it will be sufficient to investigate
the extremes of the latter function.

Fig. 3. Unit circle showing the domains where $W'(\lambda)$ as given by (245) presents one or two maxima (non-dotted and dotted domains respectively).

Evidently, $W'(\lambda)$ as given by (245) is maximized by those $\lambda$-values
which minimize the auxiliary function $J(\lambda)$, and vice versa. The
$\lambda$-values in question are seen to be

(246) $\lambda = \arccos \frac{A (1 + C^2)}{2 C^2}$, $\lambda = 0$, $\lambda = \pi$.

The behaviour of $W'(\lambda)$ being different in the cases $|A| \lessgtr 2C^2/(1 + C^2)$,
the accompanying figure shows the part of the curve $|A| = 2C^2/(1 + C^2)$
lying in the unit circle $C = 1$. An analysis of the
second derivative of $J(\lambda)$ gives the following results.
a) $|A| < 2C^2/(1 + C^2)$. Referring to the figure, the roots $A \pm iB$
are in this case lying in the non-dotted part of the unit circle.

$W'(\lambda)$ presents one maximum in the interval $(0, \pi)$, viz. in the
point $\lambda$ defined by

(247) $\lambda = \arccos [A (1 + C^2)/2C^2]$,

and two minima, viz. in the points $\lambda = 0$, and $\lambda = \pi$.



Fig. 4. Grenerating function derivative obtained from formula {245). 


Illustration. Considering, for example, the process of type II obtained by
taking $A = 0.8$, $B = 0.4$, we get $C^2 = 0.8$, $A < 2C^2/(1 + C^2)$. In full
agreement herewith, the point $(A, B) = (0.8, 0.4)$, which is plotted in fig. 3,
is lying in the non-dotted part of the unit circle. It follows from the above
that the corresponding function $W'(\lambda)$ presents but one maximum in the
interval $(0 < \lambda < \pi)$. This function $W'(\lambda)$ is shown in the
figure above.

b) $|A| \ge 2C^2/(1 + C^2)$. The roots are lying in the dotted part
of the unit circle.

$W'(\lambda)$ presents one maximum and one minimum, the former being
attained for $\lambda = 0$ if $A > 0$ and for $\lambda = \pi$ if $A < 0$, the
latter for $\lambda = \pi$ if $A > 0$ and for $\lambda = 0$ if $A < 0$. The
curves $W'(\lambda)$ are here similar to those obtained in the case $h = 1$
(cf. fig. 2, unbroken lines).

Roughly speaking, if the roots $A \pm iB$ of the characteristic
equation of the oscillatory mechanism are lying close to the
periphery of the unit circle, the maximizing $\lambda$-value (247) is seen
to approximate the intrinsic frequency $\lambda_1 = \arccos (A/C)$. In other
words, if the intrinsic oscillations are only slightly damped, the
periodogram analysis suggested above will be able to discover the
frequency of the intrinsic oscillation.

On the other hand, holding $A/B$ fixed, and letting $C$ leave the
immediate neighbourhood of the periphery of the unit circle, the
$\lambda$-values resulting from (247) will deviate more and more from
$\lambda_1 = \arccos (A/C)$. In case $A > 0$, the value given by (247) is seen
to be less than $\lambda_1$, while the reverse holds true if $A < 0$. Thus,
the periodogram will show a tendency to over-estimate the intrinsic
period if this is above 4 time units, and to under-estimate it if the
period is lying between 2 and 4 units. As to periods below 2
time units, these require a finer equidistance (cf. section 4). We
conclude next that the bias will be the larger, the more heavily
the intrinsic oscillation is damped, i.e. the smaller the damping
factor $C$ is. Further, excepting the cases of an intrinsic period
equalling exactly 2 or 4 time units, and letting the damping factor
pass below a well-defined limit, the periodogram will altogether
cease from giving information about the intrinsic period.
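
The comparison between the maximizing value (247) and the intrinsic frequency $\lambda_1 = \arccos (A/C)$ can be sketched numerically. The function name and the convention of returning `None` in case b) are assumptions made for this illustration; only the two arc-cosine expressions and the threshold $2C^2/(1 + C^2)$ are taken from the text.

```python
import math

def peak_and_intrinsic_frequency(A, B):
    """Return (peak, intrinsic) frequencies in radians for roots A +/- iB.

    peak is the interior maximizer arccos(A(1+C^2)/(2C^2)) of (247), or
    None when |A| >= 2C^2/(1+C^2), i.e. case b), where the maximum of
    W' lies at 0 or pi and gives no information on the intrinsic period.
    """
    C2 = A * A + B * B                 # squared damping factor C^2
    C = math.sqrt(C2)
    intrinsic = math.acos(A / C)       # lambda_1 = arccos(A/C)
    if abs(A) >= 2 * C2 / (1 + C2):
        return None, intrinsic
    peak = math.acos(A * (1 + C2) / (2 * C2))
    return peak, intrinsic

peak, intrinsic = peak_and_intrinsic_frequency(0.8, 0.4)
print("periodogram period:", 2 * math.pi / peak)
print("intrinsic period:  ", 2 * math.pi / intrinsic)
```

For $A = 0.8$, $B = 0.4$ the sketch gives a peak frequency slightly below the intrinsic one, so the period read from the averaged periodogram is the longer of the two, as stated above for periods above 4 time units.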

The disturbing effect pointed out may be looked upon as caused
by the external factors influencing the oscillatory mechanism. As 
by hypothesis these factors contain a random element, I propose 
the term chance effect for the bias in question. 

The situation may be described by saying that the inference 
drawn from the characteristic equation of the intrinsic oscillations 
does not apply directly to the oscillations of the mechanism when 
influenced by random external factors (cf. p. 95). While the chance 
effect is easily surveyed in the case of one intrinsic damped
harmonic, the state of things is far more complicated when the
mechanism presents a tendency to composite oscillations. Having stated
this, our analysis of the chance effect will be brought to an end 
by a few explicit illustrations. 

Illustrations. We shall first consider the case of one intrinsic oscillation, with
damping factor $C = \sqrt{0.8} = 0.894$, and frequency
$\lambda_1 = \arctan \tfrac{1}{2} = 26^\circ.56$. This case is elucidated by
the diagrams 3 and 4. Inserting $A = 0.8$ and $C = \sqrt{0.8}$ in (246), the
$\lambda$-value maximizing $W'(\lambda)$ is found to be
$\lambda = \arccos 0.9 = 25^\circ.84$. The periodogram thus tends to deliver a
period equalling $360/25.84 = 13.93$ time units, while the intrinsic period is
13.55 time units. In agreement with the general results concerning the case of
a non-composite intrinsic oscillation, the period is over-estimated.
Although the damping factor is fairly large, just below unity, the bias is rather
important.

Proceeding to some examples of periodogram ordinates derived from the model
series given in section 15, let us first consider the process $\{\xi^{(1)}(t)\}$
as defined by (103). Continuing in the notations of the present section, we
obtain by short calculations $h = 1$, $A = -0.8$, $D^2(\eta) = 0.6$,
$D^2(\xi) = 1.667$. Speaking in the language of oscillatory mechanisms, we are
concerned with a single intrinsic harmonic, the frequency, period, and damping
factor of which are $\pi$, 2, and 0.8 respectively. An ordinary periodogram
analysis gives negative results, the expectance being inversely proportional to
the length of the series analyzed. However, we know from the general theory
that the expectation of the periodogram ordinate in $\lambda = \pi$ is larger
than the expectation in a purely random series. Taking $n = 20$, and keeping in
mind that we are dealing with the exceptional case $\lambda = \pi$, the latter
is found to equal $1.667/n = 0.083$. The former, as given by the exact formula
(127), is $E[C^2(20, \pi)] = 0.59$. The proximate value obtained from (239)
equals 0.75. Now, the 1000 elements in the model series $(\xi_t^{(1)})$ have
been arranged in 50 sections, each containing 20 consecutive elements, and the
periodogram ordinate $C^2(20, \pi)$ has been computed for each section from
formula (27). The average of the 50 ordinates thus obtained equals 0.56, a
value not far from the expectation 0.59.
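
The sectioning procedure just described can be imitated on an artificial series. This is only a sketch under stated assumptions: the primary series is drawn as Gaussian noise with $D^2(\eta) = 0.6$ (the model series of the text was not necessarily built in this way), the ordinate at $\lambda = \pi$ is taken to be the squared alternating mean of the section, and the helper names, the seed and the burn-in length are hypothetical.

```python
import random

def ar1_series(phi, sigma_eta, n, seed=0, burn_in=200):
    """Simulate xi_t = phi * xi_{t-1} + eta_t with Gaussian eta (an assumption)."""
    rng = random.Random(seed)
    x, out = 0.0, []
    for t in range(n + burn_in):
        x = phi * x + rng.gauss(0.0, sigma_eta)
        if t >= burn_in:           # discard the start-up transient
            out.append(x)
    return out

def ordinate_at_pi(section):
    """Squared alternating mean: the assumed periodogram ordinate at lambda = pi."""
    n = len(section)
    s = sum(v if t % 2 == 0 else -v for t, v in enumerate(section))
    return (s / n) ** 2

# 1000 elements cut into 50 consecutive sections of 20 elements each.
series = ar1_series(phi=-0.8, sigma_eta=0.6 ** 0.5, n=1000)
ordinates = [ordinate_at_pi(series[i:i + 20]) for i in range(0, 1000, 20)]
average = sum(ordinates) / len(ordinates)
print(len(ordinates), average)
```

The average of the 50 ordinates varies noticeably from one simulated series to another, which is in line with the large dispersion of periodogram ordinates remarked upon in the text.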

Considering next the process $\{\xi^{(2)}(t)\}$ given by (104) we have $h = 1$,
$A = 0.8$, $D^2(\eta) = 0.2$, $D^2(\xi) = 0.556$. Referring again to Fig. 2, it
is seen that for small frequencies the expectation is larger than in the case
of a purely random series, but as before of the same order of magnitude in
respect of $n$. Considering, e.g., the Fourier coefficients (25) for $k = 1$ in
a sample series section of 20 elements, we have $n = 20$, and
$\lambda = 2\pi k/n = 18^\circ$. Formulae (127) and (239) give respectively
$E[C^2(20, 18^\circ)] = 0.36$ and $E[C^2(20, 18^\circ)] \approx 0.34$, while the
corresponding value in the purely random case is 0.11. On the other hand,
operating on the model series $(\xi_t^{(2)})$ in the same way as before on
$(\xi_t^{(1)})$, we have arrived at an average periodogram ordinate
$C^2(20, 18^\circ)$ equalling 0.51. The rather substantial deviation from the
expectation suggests that the periodogram ordinates are subject to a large
dispersion. Actually, this suggestion seems to indicate how the matter stands.
At any rate, the distributions of periodogram ordinates I have constructed on
the basis of model time series have all presented a pronounced skewness, and a
very large dispersion: often two or three of the largest sample values
$C^2(n, \lambda)$ constitute alone as much as 20 to 30 per cent of the sum of
the 50 sample values in the material. An instructive example of this is given
below.

Recurring to the series $(\xi_t^{(1)})$, formula (127) yields
$E[C^2(20, 18^\circ)] = 0.046$, while (239) gives
$E[C^2(20, 18^\circ)] \approx 0.038$. A periodogram analysis as described in the
previous illustration has given the average $C^2(20, 18^\circ) = 0.058$. The
figure below shows the distribution of the 50 averaged sample values of the
periodogram ordinate dealt with. The figure contains the curve of summed
relative frequencies ($F$, thick lines), and a histogram showing the
frequencies in the classes 0 to 0.01, 0.01 to 0.02, etc. ($f$, broken lines).
The frequency curve graduating the histogram has been drawn by hand (broken
curve). The distribution is seen to be very skew, and the histogram
suggests no pronounced maximum for the frequency curve.
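
The expectations quoted in these illustrations can be recomputed from the autocorrelation coefficients alone. The sketch below does not transcribe formulae (127) and (239), which are not reproduced in this section; it assumes the normalisation $C^2 = a^2 + b^2$ with Fourier coefficients carrying the factor $2/n$ (factor $1/n$ in the exceptional case $\lambda = \pi$), an assumption chosen because it reproduces the purely random values 0.083 and 0.11 given in the text. Function names are hypothetical.

```python
import math

def expected_ordinate(n, lam, variance, r):
    """Exact E[C^2(n, lam)] for a stationary series with autocorrelations r(k)."""
    weighted = sum((1 - k / n) * r(k) * math.cos(k * lam) for k in range(1, n))
    scale = 1.0 if abs(lam - math.pi) < 1e-12 else 4.0   # exceptional case lam = pi
    return scale * variance / n * (1 + 2 * weighted)

def asymptotic_ordinate_ar1(n, lam, variance, phi):
    """Large-n approximation for xi_t = phi * xi_{t-1} + eta_t."""
    scale = 1.0 if abs(lam - math.pi) < 1e-12 else 4.0
    density = (1 - phi * phi) / (1 - 2 * phi * math.cos(lam) + phi * phi)
    return scale * variance / n * density

lam18 = math.radians(18)
var1, var2 = 0.6 / 0.36, 0.2 / 0.36      # D^2(xi) for the processes (103) and (104)
print(expected_ordinate(20, math.pi, var1, lambda k: (-0.8) ** k),
      asymptotic_ordinate_ar1(20, math.pi, var1, -0.8))
print(expected_ordinate(20, lam18, var2, lambda k: 0.8 ** k),
      asymptotic_ordinate_ar1(20, lam18, var2, 0.8))
print(expected_ordinate(20, lam18, var1, lambda k: (-0.8) ** k),
      asymptotic_ordinate_ar1(20, lam18, var1, -0.8))
```

Under these assumptions the three printed pairs agree, to the two decimals given, with the values 0.59 and 0.75, 0.36 and 0.34, and 0.046 and 0.038 quoted above.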

It is interesting to notice that the aforementioned investigation by E. Slutsky
(1934) on sampling problems in the case of normally distributed time series data is
likewise discouraging with regard to the reliability of the periodogram. As
emphasized by Sir G. Walker (see, e.g., D. Brunt (1931) and R. A. Fisher (1929)),
particular caution is in place when applying the periodogram method without the
guidance of aprioristic knowledge of possible periods. In fact, when letting the
largest ordinate indicate the frequency sought for, it should be taken into
consideration that this ordinate has a larger expectance than a fixed ordinate,
or a randomly chosen ordinate.



Fig. 5. Distribution of periodogram ordinates derived from the model series (see
table 3).


Next, we shall give a few applications of the inversion formula
(200). Working on the model series given in section 15 β, we shall
exemplify the construction of the primary series from a given series
of autoregression.

Illustrations. Considering the model series $(\xi_t^{(1)})$ as defined by (103), we have
$\eta_t = \xi_t^{(1)} + 0.8\, \xi_{t-1}^{(1)}$ for $t = 1, 2, \ldots$ Inserting
successively the $\xi$-values given in table 3, we get $\eta_1 = 1.00$,
$\eta_2 = 0.20 + 0.8 \times 1.00 = 1.00$,
$\eta_3 = -0.16 + 0.8 \times 0.20 = 0.00$, etc., in full agreement with table 1.
Thus, knowing the autoregression coefficient $a_1 = 0.8$, we can reconstruct the
primary series of the autoregression series examined. In the same way, we can
reconstruct without difficulty the primary series of the model series
$(\xi_t^{(2)})$ and $(\xi_t^{(3)})$.
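
The reconstruction of the primary series is a one-line recursion once the autoregression coefficients are known. A minimal sketch, assuming that values before the first observation are treated as zero (which is what makes $\eta_1 = \xi_1$ in the illustration); the function name is hypothetical.

```python
def primary_series(xi, a):
    """eta_t = xi_t + a_1 * xi_{t-1} + ... + a_h * xi_{t-h}.

    Values of xi before the start of the sample are taken as zero
    (an assumption of this sketch).
    """
    eta = []
    for t in range(len(xi)):
        value = xi[t]
        for k, a_k in enumerate(a, start=1):
            if t - k >= 0:
                value += a_k * xi[t - k]
        eta.append(value)
    return eta

# The three xi-values quoted in the illustration, with a_1 = 0.8.
print([round(v, 2) for v in primary_series([1.00, 0.20, -0.16], [0.8])])
```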

In view of the applications, an important problem is to find the coefficients $(a)$
belonging to a given series of autoregression. This problem will be discussed in
detail in section 32. For the moment we shall only remark that the relations (222)
together with the last relation in (221) form a system of linear equations which will
permit us to derive the coefficients $a_k$ in terms of the autocorrelation
coefficients $r_k$. Thus, identifying the coefficients $r_k$ with the serial
coefficients of the series examined, the linear system will give us a set of
proximate coefficients $a_k$. We can then derive as before the primary series
which corresponds to the coefficients $a_k$ arrived at.
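
A linear system of this kind can be solved in a few lines. The sketch below assumes the system takes the form $r_k + a_1 r_{k-1} + \cdots + a_h r_{k-h} = 0$ for $k = 1, \ldots, h$, with $r_0 = 1$ and $r_{-k} = r_k$; this is the usual form for a process written as $\xi_t + a_1 \xi_{t-1} + \cdots + a_h \xi_{t-h} = \eta_t$ and is not copied from (221) and (222), which are not reproduced here. The function names are hypothetical.

```python
def solve_linear(matrix, rhs):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(aug[i][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for j in range(col, n + 1):
                aug[row][j] -= factor * aug[col][j]
    x = [0.0] * n
    for i in reversed(range(n)):                     # back substitution
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x

def autoregression_coefficients(r):
    """Solve r_k + a_1 r_{k-1} + ... + a_h r_{k-h} = 0, k = 1..h, for (a)."""
    h = len(r)
    full = [1.0] + list(r)                           # full[k] = r_k, with r_0 = 1
    matrix = [[full[abs(i - j)] for j in range(h)] for i in range(h)]
    rhs = [-full[k] for k in range(1, h + 1)]
    return solve_linear(matrix, rhs)

print(autoregression_coefficients([-0.8]))           # first-order case
print(autoregression_coefficients([0.121, -0.626]))  # second-order case
```

Feeding serial coefficients instead of hypothetical autocorrelation coefficients into the same function gives the proximate coefficients spoken of above.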

We give below the first five serial coefficients of the model series $(\xi)$, and the
corresponding autocorrelation coefficients as derived from formulae (238) and (243).

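Table 5. Serial and autocorrelation coefficients of the model series $(\xi)$.

    k                                    1        2        3        4        5

    (ξ^(1))  serial coefficient       -0.786    0.647   -0.498    0.375   -0.291
    (ξ^(1))  autocorrelation          -0.800    0.640   -0.512    0.410   -0.328

    (ξ^(2))  serial coefficient        0.862    0.700    0.580    0.482    0.398
    (ξ^(2))  autocorrelation           0.800    0.640    0.512    0.410    0.328

    (ξ^(3))  serial coefficient        0.127   -0.628   -0.194    0.397    0.186
    (ξ^(3))  autocorrelation           0.121   -0.626   -0.204    0.366    0.206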

It is interesting to notice that although the model series consist of as many as
1000 elements each, the deviations between empirical and hypothetical correlation
coefficients are rather substantial (cf. also p. 50 and p. 109).


Our analysis of the process of linear autoregression will be
concluded by revealing a connexion with the »sinusoidal limit
theorems» of E. Slutsky and V. Romanovsky touched upon in
section 16.

Let

$$L(x) = x^2 - 2Ax + 1 = 0, \qquad -1 < A < 1,$$

stand for the characteristic equation of a simple harmonic
$P_1 \cos \lambda_1 t + P_2 \sin \lambda_1 t$,
and let $\{\xi^{(1)}(t)\}, \{\xi^{(2)}(t)\}, \ldots$ represent a sequence of
processes of linear autoregression defined by

$$\xi^{(p)}(t) - 2A_p\, \xi^{(p)}(t-1) + C_p^2\, \xi^{(p)}(t-2) = \eta^{(p)}(t).$$

Let it further be assumed that the processes have equal dispersion,

$$D(\xi^{(p)}(t)) = \sigma,$$

and that

$$\lim A_p = A, \qquad \lim C_p = 1.$$

Attaching $(p)$ to the symbols referring to the processes $\{\xi^{(p)}(t)\}$, we
obtain from (204) and (244)

$$K_p^2 = \frac{1 + C_p^2}{(1 - C_p^2)(1 + C_p^2 - 2A_p)(1 + C_p^2 + 2A_p)}.$$

Since by hypothesis $1 - C_p$ tends to zero as $p \to \infty$, it follows that
$K_p \to \infty$ as $p \to \infty$. Concluding next from (242) that $b_k^{(p)}$
is bounded, it follows that, for all $k$, $\lim_{p \to \infty} b_k^{(p)}/K_p = 0$.
Considering now the systems (221)-(223), it is seen that

(248)  $\lim_{p \to \infty} \ldots = 0, \qquad -\infty < k < \infty.$

The relation (248) thus arrived at implies that the sequence
$\{\xi^{(p)}(t)\}$ is ruled by the sinusoidal limit theorem (cf. p. 64). Thus,
approximating an arbitrary sample series $(\ldots, \xi_t^{(p)}, \ldots)$ by a
suitable simple harmonic with frequency $\lambda_1$, say $x_p(t, \xi)$, and
holding $n$ fixed, it follows that for every $\varepsilon > 0$

$$\lim_{p \to \infty} P\big[\,|x_p(t_0 + 1, \xi) - \xi^{(p)}_{t_0+1}| < \varepsilon, \; \ldots, \; |x_p(t_0 + n, \xi) - \xi^{(p)}_{t_0+n}| < \varepsilon\,\big] = 1.$$

Since the proving arguments are of a general nature, they will
be found to apply also in the case of a relation $L(x) = 0$ of
arbitrary order. The process of linear autoregression thus forms a
convenient starting point for the construction of sequences covered by
the sinusoidal limit theorems.¹ Obtaining in this way a continuous
passage to a singular process, it is of interest to notice that the
formula (199), together with the observation that
$\sum_{k=0}^{\infty} (b_k^{(p)})^2$ tends to infinity with $p$, forms a
sufficient basis for proving that the singular limit process is normal.


¹ A sequence of moving averages cannot be ruled by the sinusoidal limit law
unless the number of weights tends to infinity (cf. p. 65).


26. On the process of moving averages.

Proceeding in the present section to an analysis of the process
of moving averages as defined in section 15 β, we shall first apply
the general formulae in section 23. The question of when a
representation of type (200) holds will give rise to a discussion
elucidating certain points of the autoregression analysis as presented
in sections 19 and 20.

By definition, the general formula for a process $\{\xi(t)\}$ of moving
averages reads

(249)  $\{\xi(t)\} = \sum_{k=0}^{h} b_k \{\eta(t-k)\},$

where $\{\eta(t)\}$ is purely random, and the sequence
$(b) = (b_0, b_1, \ldots, b_h)$ is real. As before, we shall assume that
$D(\eta)$ is finite, and that $E[\eta] = 0$. Thus (249) forms a special case of
the variable defined by (199), and it follows further that the formal
developments remain valid if $\{\eta(t)\}$ is non-autocorrelated.

In the process of linear autoregression, the autocorrelation
coefficients and forecasting values were found to follow certain damped
harmonics. In the present case only the first $h$ elements in the
sequences mentioned are different from zero. In fact, (204) and
(205) yield for a process $\{\xi(t)\}$ given by (249)

(250)  $r_k = \begin{cases} \dfrac{b_k + b_1 b_{k+1} + \cdots + b_{h-k} b_h}{1 + b_1^2 + \cdots + b_h^2} & \text{for } k \le h, \\[1ex] 0 & \text{for } k > h, \end{cases}$

where $k \ge 0$ and $b_0 = 1$, formulae given by Professor H. Cramér
in his 1933 Course. Next, (213) gives

$$F[\xi(t+k)] = \begin{cases} b_k \eta_t + b_{k+1} \eta_{t-1} + \cdots + b_h \eta_{t-h+k} & \text{for } 0 \le k \le h, \\ 0 & \text{for } k > h, \end{cases}$$

where the forecast is based upon the condition

$$\eta(t) = \eta_t, \quad \eta(t-1) = \eta_{t-1}, \quad \ldots, \quad \eta(t-h+1) = \eta_{t-h+1}.$$

Inserting (250) in the general formula (116) for the generating
function $W(x)$ of the autocorrelation coefficients in a stationary
process, we conclude that $W(x) - x$ reduces to a finite
trigonometrical sum. Thus, in the present case $W'(x)$ exists, and is like
$W(x)$ uniformly bounded. Consequently, replacing »linear
autoregression» by »moving averages» in theorem 11, we get a
corresponding theorem covering the present case. We conclude, i.a.,
that periodogram analysis is an inadequate method of research in
the case of moving averages also: the expectance of an arbitrary
periodogram ordinate is inversely proportional to the length of the
series under analysis. Formula (256) gives $W'(x)$ explicitly.

In view of the applications, the following questions call for
investigation (cf. the illustrations on p. 119). The autocorrelation
coefficients of a process of moving averages being given, is it
possible to derive the primary process and the coefficients $(b)$ of
the moving average? In particular, does there exist a relation of
type (200) yielding the primary process? If the answer to the
second question is in the affirmative, how can the coefficients $(a)$
be obtained? The first of these problems was formulated by Professor
H. Cramér in his 1933 Course.

It is suitable to start the analysis with the second question.
Denoting the process considered by $\{\xi(t)\}$, we shall, until further
notice, assume that $\{\xi(t)\}$ is given by (249).

In the first place we observe that theorem 9 covers the case
when all the roots of the equation

(251)  $x^h + b_1 x^{h-1} + \cdots + b_{h-1} x + b_h = 0$

are of modulus less than unity. The theorem states that if this
condition is satisfied, an infinite sequence $(a) = (a_1, a_2, \ldots)$ such
that (200) holds is given by the system (97) and the difference relations
obtained for $a_k$ when replacing the $a$'s by the $b$'s in (96). These
relations constitute a difference equation of order $h$ satisfied by the
sequence $(a)$, and it should further be observed that the sequence
$(a)$ arrived at forms a composed damped harmonic.
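
The difference equation satisfied by the sequence $(a)$ can be run forward as a recursion. The sketch assumes the relations take the form $a_0 = 1$ and $a_k + b_1 a_{k-1} + \cdots + b_h a_{k-h} = 0$ for $k \ge 1$ (terms with negative index dropped); this form is inferred from the requirement that (200) undo (249), not copied from (96) and (97). The function name is hypothetical.

```python
def inversion_coefficients(b, n_terms):
    """a_0, ..., a_{n_terms-1}: a_0 = 1, a_k = -(b_1 a_{k-1} + ... + b_h a_{k-h})."""
    h = len(b) - 1
    a = [1.0]
    for k in range(1, n_terms):
        a.append(-sum(b[j] * a[k - j] for j in range(1, min(h, k) + 1)))
    return a

# Root -0.5 inside the unit circle: the coefficients are the powers of -0.5.
print(inversion_coefficients([1, 0.5], 6))
# Roots -0.5 and -0.2: a composed damped sequence.
print(inversion_coefficients([1, 0.7, 0.1], 6))
```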

We have found that the condition attached to the roots of (251)
will secure the representation (200). Under these circumstances the
relations (208) and (210) derived from (206) and (207) must hold
true. Explicitly, these read in the present case as follows:

(252)  $a_{2h+k} r_h + a_{2h+k-1} r_{h-1} + \cdots + a_{h+k+1} r_1 + a_{h+k} + a_{h+k-1} r_1 + \cdots + a_{k+1} r_{h-1} + a_k r_h = 0,$

(253)
$$\begin{aligned}
a_{2h} r_h + a_{2h-1} r_{h-1} + \cdots + a_{h+1} r_1 + a_h + a_{h-1} r_1 + \cdots + a_1 r_{h-1} + r_h &= 0, \\
a_{2h-1} r_h + a_{2h-2} r_{h-1} + \cdots + a_h r_1 + a_{h-1} + a_{h-2} r_1 + \cdots + a_1 r_{h-2} + r_{h-1} &= 0, \\
\cdots \qquad & \\
a_{h+1} r_h + a_h r_{h-1} + \cdots + a_2 r_1 + a_1 + r_1 &= 0,
\end{aligned}$$

(254)
$$\begin{aligned}
a_h r_h + a_{h-1} r_{h-1} + \cdots + a_1 r_1 + 1 &= 1/K^2, \\
a_{h-1} r_h + \cdots + a_1 r_2 + r_1 &= b_1/K^2, \\
\cdots \qquad & \\
a_2 r_h + a_1 r_{h-1} + r_{h-2} &= b_{h-2}/K^2, \\
a_1 r_h + r_{h-1} &= b_{h-1}/K^2, \\
r_h &= b_h/K^2.
\end{aligned}$$
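
The relations (252) to (254) can be checked numerically for any regular sequence $(b)$ whose roots lie inside the unit circle. The sketch below is an illustration under assumptions: it takes $K^2 = 1 + b_1^2 + \cdots + b_h^2$, computes $(r)$ from (250) and $(a)$ from the recursion $a_k + b_1 a_{k-1} + \cdots + b_h a_{k-h} = 0$ with $a_0 = 1$, and reads (252) and (253) as the statement that $\sum_j a_j r_{m-j} = 0$ for every lag $m \ge 1$. The function name is hypothetical.

```python
def check_relations(b, n_terms=40):
    """Return the residuals of (252)-(253) and of (254) for weights b."""
    h = len(b) - 1
    k2 = sum(w * w for w in b)                                   # K^2
    r = [sum(b[j] * b[j + k] for j in range(h - k + 1)) / k2 for k in range(h + 1)]
    a = [1.0]
    for k in range(1, n_terms):
        a.append(-sum(b[j] * a[k - j] for j in range(1, min(h, k) + 1)))

    def rho(m):
        """Autocorrelation at lag m, zero beyond lag h."""
        return r[abs(m)] if abs(m) <= h else 0.0

    # (252)-(253): sum over j of a_j * r_{m-j} vanishes for every lag m >= 1.
    zero_side = [sum(a[j] * rho(m - j) for j in range(m + h + 1))
                 for m in range(1, 2 * h + 3)]
    # (254): r_k + a_1 r_{k+1} + ... + a_{h-k} r_h - b_k / K^2, for k = 0..h.
    gaps = [sum(a[j] * rho(k + j) for j in range(h - k + 1)) - b[k] / k2
            for k in range(h + 1)]
    return zero_side, gaps

zero_side, gaps = check_relations([1, 0.7, 0.1])
print(max(abs(v) for v in zero_side), max(abs(v) for v in gaps))
```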






Comparing the above coefficients $a_k$ and the coefficients $a_{kn}$
yielded by an autoregression analysis of $\{\xi(t)\}$ as described in section
19, and letting $\varepsilon > 0$ be arbitrarily given, it follows that there
exists an $n$ such that

$$D^2(\eta) + \varepsilon > D^2\big(\xi(t) + a_1 \xi(t-1) + \cdots + a_n \xi(t-n)\big) \ge D^2\big(\xi(t) + a_{1n} \xi(t-1) + \cdots + a_{nn} \xi(t-n)\big) \ge D^2(\eta),$$

the second inequality resulting from the definition of the
coefficients $a_{kn}$ and the third being similar to (182). Inserting (249),
an elementary transformation yields

$$\begin{aligned}
\varepsilon &> (a_1 + b_1)^2 + (a_2 + a_1 b_1 + b_2)^2 + \cdots + (a_n + a_{n-1} b_1 + \cdots + a_{n-h} b_h)^2 + \cdots + (a_n b_h)^2 \\
&\ge (a_{1n} + b_1)^2 + (a_{2n} + a_{1n} b_1 + b_2)^2 + \cdots + (a_{nn} + a_{n-1,n} b_1 + \cdots + a_{n-h,n} b_h)^2 + \cdots + (a_{nn} b_h)^2 \ge 0.
\end{aligned}$$

Concluding that $|a_{kn} - a_k| < C_k \sqrt{\varepsilon}$, where $C_k$ is
independent of $n$, it follows that $a_{kn} \to a_k$ as $n \to \infty$.
Moreover, forming the difference of the residuals based on
$(a_1, \ldots, a_n)$ and $(a_{1n}, \ldots, a_{nn})$, a simple
application of Schwarz' inequality shows that the non-autocorrelated
process yielded by the autoregression analysis is identical with the
primary process.

Considering the singular case when (251) presents at least one root on the
boundary of the unit circle, we conclude from theorem 8 that the series
$\sum a_k^2$ will be divergent, and that no representation of type (200) will
exist. By elementary transformations we shall show that this obstacle to
calculating the primary process may be removed by means of a limit passage.

Let (249) represent a moving average, $x_n$ the roots of the characteristic
equation (251), and let $|x_n| \le 1$. Introducing auxiliary averages
$\{\xi^{(i)}(t)\}$ defined by

$$\xi^{(i)}(t) = \eta(t) + b_1^{(i)} \eta(t-1) + \cdots + b_h^{(i)} \eta(t-h), \qquad i = 1, 2, \ldots,$$

let the symbols referring to $\{\xi^{(i)}(t)\}$ be marked by $(i)$. Let
$x_n^{(i)} = x_n$ when $|x_n| < 1$, and let $x_n - x_n^{(i)} = \Delta_i$,
$|x_n^{(i)}| = 1 - \varepsilon_i$ when $|x_n| = 1$, where
$0 < \varepsilon_i = |\Delta_i| \to 0$ as $i \to \infty$.

By construction, the inversion formula (200) applies to $\{\xi^{(i)}(t)\}$,

$$\eta(t) = \xi^{(i)}(t) + a_1^{(i)} \xi^{(i)}(t-1) + a_2^{(i)} \xi^{(i)}(t-2) + \cdots.$$

The limit relation desired is based on the coefficients $a_k^{(i)}$ and reads

(255)  $\{\eta(t)\} = \lim_{i \to \infty} \big[\{\xi(t)\} + a_1^{(i)} \{\xi(t-1)\} + a_2^{(i)} \{\xi(t-2)\} + \cdots\big].$

Writing $\{\eta(t, i)\}$ for the process under the limes sign, it is sufficient
to prove that $D_i^2 = D^2(\eta(t, i) - \eta(t))$ is of the same order of
magnitude as $\varepsilon_i$ (see p. 40).

Paying regard to (198), we get $D_i^2 = \sum_t \big(L[a_t^{(i)}]\big)^2$, where

$$L[x(t)] = x(t) + b_1 \cdot x(t-1) + \cdots + b_{h-1} \cdot x(t-h+1) + b_h \cdot x(t-h).$$

Write next $a_t^{(i)}$ on the form
$a_t^{(i)} = \sum_n \Pi_n^{(i)}(t) \cdot (x_n^{(i)})^t$, where $\Pi_n^{(i)}$ is a
polynomial of order $h_n - 1$, and $h_n$ the multiplicity of the root $x_n$
(cf. (33)). Inserting in $D_i^2$, the terms of type $t^k (x_n^{(i)})^t$, where
$x_n^{(i)} = x_n$, will cancel out because they satisfy the relations
$L[t^k \cdot (x_n)^t] = 0$. It remains to estimate the terms involving
$t^k \cdot (x_n^{(i)})^t$, where $x_n^{(i)} = x_n - \Delta_i$. According to the
inequality of Schwarz it is evidently sufficient to verify that, for any of the
$k$-values in question,

$$S = \sum_{t=1}^{\infty} \big(L[t^k \cdot (x_n - \Delta_i)^t]\big)^2 \to 0 \quad \text{as } i \to \infty.$$

To prove this, remove the factors $(x_n - \Delta_i)^{t-h}$ from
$L[t^k \cdot (x_n - \Delta_i)^t]$, and develop the remaining terms
$(x_n - \Delta_i)^j$ according to the binomial theorem. Then we get

$$S = \sum_t (x_n - \Delta_i)^{2(t-h)} \cdot \big[P(t, k) + P_1(t, k) \cdot \Delta_i + \cdots + P_h(t, k) \cdot \Delta_i^h\big]^2,$$

where

$$P(t, k) = x_n^{-t+h} \cdot L[t^k \cdot x_n^t] = x_n^h \cdot t^k + b_1 x_n^{h-1} \cdot (t-1)^k + \cdots + b_h \cdot (t-h)^k.$$

Evidently $P(t, k) = 0$ for all $t$. Disregarding constant factors, we have further

$$P_1(t, k) = h\, x_n^{h-1} \cdot t^k + (h-1)\, b_1 x_n^{h-2} \cdot (t-1)^k + \cdots + b_{h-1} \cdot (t-h+1)^k,$$

$$P_2(t, k) = h(h-1)\, x_n^{h-2} \cdot t^k + (h-1)(h-2)\, b_1 x_n^{h-3} \cdot (t-1)^k + \cdots + 2\, b_{h-2} \cdot (t-h+2)^k, \quad \text{etc.}$$

Paying regard to the identities $P(t, k) = P(t, k-1) = \cdots = P(t, 0) = 0$, we
get without difficulty $P(t+1, k+1) - P(t, k+1) = 0$ for all $t$. Hence
$P(t, k+1)$ must reduce to a constant, say $c_0$. In the same way we get
$P(t+1, k+2) - P(t, k+2) = c_1 \cdot P(t, k+1)$, which shows that $P(t, k+2)$ is
linear in respect to $t$, and by induction we find that $P(t, k+s)$ is a
polynomial in $t$ of order $s - 1$.

Elementary transformations show that
$c_0 = P(t, k+1) - (t-h) \cdot P(t, k) = P_1(t, k)$ for all $t$. Similarly,
$P_1(t, k+1) - (t-h+1) \cdot P_1(t, k) = P_2(t, k)$ is linear in respect of $t$,
and repeating the procedure we find that $P_s(t, k)$ is of order $s - 1$.

We conclude that $S$ is linearly composed of a finite number of terms of type
$S_k = \Delta_i^{k+2} \sum_t t^k \cdot (x_n - \Delta_i)^{2t}$, where $k \ge 0$.
Hence

$$|S_k| \le \varepsilon_i^{k+2} \cdot \sum_{t=1}^{\infty} t^k \cdot (1 - \varepsilon_i)^{2t} < A \cdot \varepsilon_i^{k+2} \cdot \big(1 - (1 - \varepsilon_i)^2\big)^{-k-1},$$

where $A$ is independent of $i$. It is seen that for any finite $k \ge 0$ the
term $|S_k|$ is of an order $\le \varepsilon_i$, which completes the proof.

The above analysis has shown that if (251) has no root falling
outside the unit circle, certain well-defined linear operations on the
moving average given by (249) will yield the primary process
$\{\eta(t)\}$. If $|x_k| \le 1$ for all $k$, the sequence $(b)$ and the process
$\{\xi(t)\}$ will be termed regular.

In order to prepare the analysis of non-regular sequences $(b)$,
let (249) be an arbitrary process of moving averages, and let its
autocorrelation coefficients be represented by $r_k$. Paying regard to
(250), we obtain the fundamental identity

(256)
$$\frac{1}{K^2}\,(x^h + b_1 x^{h-1} + \cdots + b_{h-1} x + b_h)(b_h x^h + b_{h-1} x^{h-1} + \cdots + b_1 x + 1) = r_h x^{2h} + r_{h-1} x^{2h-1} + \cdots + r_1 x^{h+1} + x^h + r_1 x^{h-1} + \cdots + r_{h-1} x + r_h.$$

Replacing here $x$ by $e^{ix}$, formula (121) shows that we get $W'(x)$.

Until further notice, we shall again assume that (249) is regular.
Denoting as before the zeros of the first factor by $x_k$, it is seen
that the zeros of the second factor are given by $1/x_k$. Consequently,
the zeros of the right member may be denoted $x_1, \ldots, x_{2h}$,
where

$$x_{2h-k+1} = 1/x_k, \qquad 0 < |x_1| \le |x_2| \le \cdots \le |x_h| \le 1 \le |x_{h+1}| \le \cdots \le |x_{2h}|.$$

Further, it follows that if there exists another sequence, say
$(1, b'_1, \ldots, b'_h)$, such that the corresponding moving average will have
autocorrelation coefficients coinciding with those of (249), then one
zero of the polynomial $x^h + b'_1 x^{h-1} + \cdots + b'_{h-1} x + b'_h$ must
equal either $x_1$ or $1/x_1$, another either $x_2$ or $1/x_2$, etc. Evidently,
there are at most $2^h$ real sequences of this type, say
$(b^{(0)}), (b^{(1)}), \ldots$ Letting $(b^{(0)})$ represent the regular
sequence started from, a short reflection reveals that all the other sequences
in the group are non-regular.

If in this way a group $(b^{(i)})$ of sequences is attached to every
regular sequence $(b^{(0)})$, it is clear that an arbitrary sequence
$(b) = (1, b_1, \ldots, b_h)$ will belong to one, and only one, of these groups.
It should further be observed that this group may be constructed
with the use of only the sequence $(b)$. Similarly, a group will
evidently be uniquely determined by the corresponding sequence of
autocorrelation coefficients.
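
The construction of a group from a single sequence can be sketched for $h = 1$ and $h = 2$, where the roots of (251) are available in closed form. The restriction to $h \le 2$, the tolerance, the rounding used to merge duplicates, and all function names are choices of this illustration; a general implementation would need a polynomial root finder.

```python
import cmath
from itertools import product

def roots_of(b):
    """Roots of x^h + b_1 x^(h-1) + ... + b_h for h = 1 or 2 (sketch only)."""
    if len(b) == 2:
        return [complex(-b[1])]
    if len(b) == 3:
        disc = cmath.sqrt(b[1] * b[1] - 4 * b[2])
        return [(-b[1] + disc) / 2, (-b[1] - disc) / 2]
    raise ValueError("this sketch handles h = 1 or h = 2 only")

def sequence_from_roots(roots):
    """Coefficients (1, b_1, ..., b_h) of the monic polynomial with these roots."""
    coeffs = [1 + 0j]
    for root in roots:                       # multiply by (x - root)
        coeffs = [c - root * p for c, p in zip(coeffs + [0], [0] + coeffs)]
    return coeffs

def group_of(b, tol=1e-9):
    """All real sequences obtained by replacing some roots x_k by 1/x_k."""
    roots = roots_of(b)
    found = set()
    for flips in product([False, True], repeat=len(roots)):
        if any(f and abs(x) < tol for f, x in zip(flips, roots)):
            continue                         # a zero root cannot be inverted
        new_roots = [1 / x if f else x for f, x in zip(flips, roots)]
        coeffs = sequence_from_roots(new_roots)
        if all(abs(c.imag) < tol for c in coeffs):
            found.add(tuple(round(c.real, 9) + 0.0 for c in coeffs))
    return sorted(found)

print(group_of([1, 2]))
print(group_of([1, 0.5, -0.5]))
print(group_of([1, -1]))
```

A root on the unit circle is left in place by the replacement up to the reality condition, which is why a sequence such as $(1, -1)$ forms a group by itself.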

Thus prepared, let $\{\eta(t)\}$ be a purely random process, and let
$(b^{(i)})$ be a group of finite sequences as defined above. Writing

(257)  $(K^{(i)})^2 = 1 + (b_1^{(i)})^2 + (b_2^{(i)})^2 + \cdots + (b_h^{(i)})^2,$

let, correspondingly, a group $(\{\xi^{(i)}(t)\})$ of moving averages
be defined by

(258)  $\{\xi^{(i)}(t)\} = \dfrac{K^{(0)}}{K^{(i)}} \big[\{\eta(t)\} + b_1^{(i)} \{\eta(t-1)\} + \cdots + b_h^{(i)} \{\eta(t-h)\}\big].$

Marking the symbols referring to different processes in a group by
corresponding indices, it follows from the construction of the group
that

(259)  $D(\xi^{(i)}) = D(\xi^{(0)}), \qquad r_k^{(i)} = r_k^{(0)}, \qquad k = 0, \pm 1, \pm 2, \ldots.$

Further, the group will contain one, and only one regular process,
viz. $\{\xi^{(0)}(t)\}$. This will be assumed to be given by (249), and be
alternatively denoted by $\{\xi(t)\}$.

If all roots of (251) are falling on the periphery of the unit
circle, the group will evidently contain only the process
$\{\xi^{(0)}(t)\} = \{\xi(t)\}$. Otherwise the group will include more than one
process, at most $2^h$ in number. Furthermore, it should be observed that an
equivalent construction of the group is possible on the basis
of the primary process $\{\eta(t)\}$ and the characteristics (259) common
to the processes in the group.
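
That the normalising factor $K^{(0)}/K^{(i)}$ equalises dispersion and autocorrelation within a group can be checked by comparing the autocovariances of the weight sequences. A minimal sketch, assuming a primary process of unit dispersion; the function names are hypothetical.

```python
def autocovariances(weights):
    """Autocovariances of sum_j w_j * eta(t-j) for a unit-variance primary process."""
    n = len(weights)
    return [sum(weights[j] * weights[j + k] for j in range(n - k)) for k in range(n)]

def normalised_weights(b_regular, b_other):
    """Weights of (258): (K0/Ki) * (1, b_1^(i), ..., b_h^(i))."""
    k0 = sum(w * w for w in b_regular) ** 0.5
    ki = sum(w * w for w in b_other) ** 0.5
    return [k0 / ki * w for w in b_other]

regular, other = [1, 0.5, -0.5], [1, -1, -2]
print(normalised_weights(regular, other))
print(autocovariances(regular))
print(autocovariances(normalised_weights(regular, other)))
```

The last two printed lists coincide up to rounding, so the two moving averages cannot be told apart by their dispersion and autocorrelation coefficients.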

Referring to the autoregression analysis as set forth in section
19, it is seen that the coefficients in the formulae for the residuals
$\eta(t; n)$ involve only the autocorrelation coefficients and the
dispersion of the process under analysis. Let this fact be combined with
the above observation that the limit residual
$\lim_{n \to \infty} \eta(t; n)$ of a regular process of moving averages may be
obtained directly, viz. either from (200) or from (255). We conclude that the
autoregression residuals, say $\{\eta^{(i)}(t)\}$, of the non-regular
processes in a group $(\{\xi^{(i)}(t)\})$ will be given by corresponding linear
operations, and that these expressions will involve exactly the same
coefficients $(a)$ as in the case of the regular average.

Considering in the first place the case when all the roots of the
equation (251) obtained from the regular process (249) in the group
are falling in the interior of the unit circle, the above argument
yields

(260)
$$\begin{aligned}
\eta^{(i)}(t) = \frac{K^{(0)}}{K^{(i)}} \big[\, &\eta(t) + (a_1 + b_1^{(i)}) \cdot \eta(t-1) + \cdots \\
&+ (a_h + a_{h-1} b_1^{(i)} + \cdots + b_h^{(i)}) \cdot \eta(t-h) \\
&+ (a_{h+1} + a_h b_1^{(i)} + \cdots + a_1 b_h^{(i)}) \cdot \eta(t-h-1) + \cdots \big].
\end{aligned}$$

Since $(b^{(i)}) \ne (b^{(0)})$, the residual cannot reduce to $\eta(t)$.
Further, were $\eta^{(i)}(t)$ a finite moving average, it would follow that
$r_k(\eta^{(i)}) \ne 0$ for some $k > 0$. As $\{\eta^{(i)}(t)\}$ is
non-autocorrelated this is impossible, so $\eta^{(i)}(t)$ as given by (260) must
be an infinite moving average (cf. section 15 γ). Now, paying regard to the
relations (250) and (252)-(254), an elementary transformation will verify that
$r_k(\eta^{(i)}) = 0$ for $k \ne 0$.

On the other hand, if at least one root of (251) is lying on the
periphery of the unit circle, the representation (255) gives

$$\{\eta^{(i)}(t)\} = \lim_{k \to \infty} \big[\{\xi^{(i)}(t)\} + a_1^{(k)} \{\xi^{(i)}(t-1)\} + a_2^{(k)} \{\xi^{(i)}(t-2)\} + \cdots\big].$$

By construction, the coefficients $b_n^{(k)}$ connected with (255) are such
that $\lim_{k \to \infty} b_n^{(k)} = b_n$. Keeping this in mind, we conclude
from (96) and (97) that $\lim_{k \to \infty} a_n^{(k)} = a_n$. Now, the relation
(255) implies that we may express the $\xi$'s in terms of the $\eta$'s, and that
the resulting sum, say $\sum_n \{\eta(t-n)\} \cdot c_n^{(k)}$, has a limit
equalling the sum of the limits $\{\eta(t-n)\} \cdot \lim c_n^{(k)}$. Since
$\{\xi^{(i)}\}$ and $\{\xi\}$ have identical autocorrelation properties, we may
in the above relation perform the same procedure on $\{\xi^{(i)}\}$. It follows
that the representation (260) holds even in this case, without a limit passage
being required.

Having now illustrated theorem 6 by means of a process of moving
averages, the representation secured by theorem 7 is readily
obtained directly. In fact, since the coefficients $(b)$ in the canonical
formula in theorem 7 are derived solely from the dispersion and
the autocorrelation coefficients of the process considered, we
conclude that they must be identical for all processes in a group
$(\{\xi^{(i)}\})$ as defined above. Thus we have (cf. also (258))

(261)  $\{\xi^{(i)}(t)\} = \{\eta^{(i)}(t)\} + b_1 \{\eta^{(i)}(t-1)\} + \cdots + b_h \{\eta^{(i)}(t-h)\}$

for all processes in the group $(\{\xi^{(i)}\})$. It should be noticed that (254)
gives the coefficients $(b)$ in terms of the sequences $(a)$ and $(r)$
characteristic to the group.

Summing up the main results, the analysis gives the following
answers to the questions set forth on p. 123. The dispersion and
the autocorrelation coefficients of a process of moving averages
being given, there will in general exist a well-defined group of
moving averages with the same characteristics, and with the same
primary process. These moving averages are limited in number,
and it is possible to construct the corresponding sequences of
coefficients $(b)$ by means of the characteristics prescribed. Alternatively,
if one sequence $(b)$ in a group is known, the others are uniquely
determined.

Among the moving averages in a group it is possible to distinguish
one, the regular process, which alone has the property that the primary
process will be given either by a relation (200) or by a limit relation
of the same type. The coefficients $(a)$ in these representations are
uniquely determined by the coefficients $(b)$ of the regular process.

For the non-regular processes in a group, there exists no relation of
type (200) yielding the primary process. However, inserting a non-regular
process $\{\xi^{(i)}\}$ in the representation of the primary process in terms
of the regular process belonging to the same group, we get the
non-autocorrelated residual of $\{\xi^{(i)}\}$ secured by theorem 6. According
to theorem 7, the process $\{\xi^{(i)}\}$ may be looked upon as a moving average
of its residual. The coefficients of this average are nothing else than
the coefficients $(b)$ of the regular process in the group around
$\{\xi^{(i)}\}$.

In connexion with the applications in section 31, we shall derive
a linear relation of more general type than (200), which yields the
primary process in terms of a non-regular moving average.

The autoregression analysis has given us no tool for distinguish- 
ing between the different moving averages in a group. If more 
precise information is required, other methods have to be applied, 
e. g. an analysis of conditioned variables (cf. p. 164). Such lines of 
research falling outside of the scope of the present study, this section 
will be terminated by some explicit examples of the previous analysis.

Illustrations. 1) Let the autocorrelation coefficients of a moving average be
given by $r_1 = -0.5$; $r_n = 0$ for $n > 1$.

The relation (256) reads in this case

$$0.5\,(x - 1)(-x + 1) = -0.5x^2 + x - 0.5.$$

We conclude that the group $(b^{(i)})$ contains only one sequence, viz. $(1, -1)$.
Thus, the general formula of type (249) for a moving average with the given
autocorrelation coefficients reads

(262)  $\{\xi(t)\} = \{\eta(t)\} - \{\eta(t-1)\}.$

The equation (251) has only one root, $x = 1$, and this falls on the unit circle.
Hence, in order to express $\eta$ in terms of $\xi$ we have to apply the limit
relation (255). Taking, for instance, $\varepsilon_k = 10^{-k}$, we get

(263)  $\{\eta(t)\} = \lim_{k \to \infty} \big[\{\xi(t)\} + (1 - 10^{-k}) \cdot \{\xi(t-1)\} + (1 - 10^{-k})^2 \cdot \{\xi(t-2)\} + \cdots\big].$

Table 2 contains a model series section of a process of type (262). Taking $k = 1$,
and applying (263) to the last element but one in this section, it is seen that we get
the following approximation to the last element in table 1, (2),

-2+2 (-9) —(-9)® + (-9)* —(-9)® —('9)® + (’9)’' + ('9)® + • • ■ • 

A computation of this sum has given $-0.97$, which is fairly close to the exact value,
i.e. $-1$. In point of principle, it is possible to reconstruct in this way the model
series $(\eta_t)$ on the basis of the model series $(\xi_t)$.
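
The limit passage of (263) can be watched on a simulated series. The sketch assumes a Gaussian primary process of unit dispersion and evaluates the damped sum $\sum_j c^j \xi(t-j)$ recursively for a fixed damping factor $c = 1 - 10^{-k}$; the seed, series length and function name are hypothetical. By a short calculation of our own (not taken from the text) the mean squared error of the damped sum should be close to $(1 - c)/(1 + c)$, so it shrinks as $c$ approaches unity.

```python
import random

def damped_inversion_mse(damping, n=20000, burn_in=2000, seed=1):
    """Mean squared error of sum_j damping^j * xi(t-j) as an estimate of eta(t)."""
    rng = random.Random(seed)
    eta_prev = rng.gauss(0.0, 1.0)
    running, total, count = 0.0, 0.0, 0
    for t in range(n):
        eta = rng.gauss(0.0, 1.0)
        xi = eta - eta_prev                 # process of type (262)
        running = xi + damping * running    # recursive form of the damped sum
        if t >= burn_in:                    # skip the start-up transient
            total += (running - eta) ** 2
            count += 1
        eta_prev = eta
    return total / count

for k in (1, 2):
    c = 1 - 10 ** (-k)
    print(k, damped_inversion_mse(c), (1 - c) / (1 + c))
```

With $k = 1$ the error is still visible, which matches the approximate value $-0.97$ obtained in the text for an exact value of $-1$.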

2) Let the coefficients $(b)$ of a moving average (249) be given by $(1, 2)$.

Forming the relation (256), we get

$$0.2\,(x + 2)(2x + 1) = 0.4x^2 + x + 0.4,$$

and conclude that $r_1 = 0.4$, $r_k = 0$ for $k > 1$. The group $(b^{(i)})$ is
seen to consist of $(1, 0.5)$ and $(1, 2)$, the former sequence being the
regular one. Now, the system (97) gives $a_1 = -0.5$, while the relations
corresponding to (96) show that $a_k = (-0.5)^k$. Considering the general
formula for the regular process,

$$\{\xi(t)\} = \{\eta(t)\} + 0.5\,\{\eta(t-1)\},$$

it follows that

$$\{\eta(t)\} = \{\xi(t)\} - 0.5\,\{\xi(t-1)\} + 0.25\,\{\xi(t-2)\} - \cdots$$

in full agreement with (200). Observing that $(K^{(0)})^2 = 1.25$, and that
$(K^{(1)})^2 = 5$, formula (258) gives for the remaining process in the group
$(\{\xi^{(i)}\})$

(264)  $\{\xi^{(1)}(t)\} = 0.5\,\{\eta(t)\} + \{\eta(t-1)\}.$

According to the general analysis, the residual secured by theorem 6 is ob- 

tained by replacing ^ by in (200). As is readily verified, the corresponding 
representation of type (260) reads 
We obtain, in analogy with (264), and in harmony with the general formula (261)

$\{\eta^{(1)}(t)\} + 0.5\,\{\eta^{(1)}(t-1)\} = 0.5\,\{\eta(t)\} + \{\eta(t-1)\}$.

It can be easily verified that $D(\eta^{(1)}(t)) = D(\eta(t))$, and that the process
$\{\eta^{(1)}(t)\}$ is non-autocorrelated, i.e. that $r_k(\eta^{(1)}) = 0$ for $k \neq 0$.
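
That the two sequences (1, 2) and (1, 0.5) lead to the same autocorrelation coefficients can be checked directly. The sketch below is illustrative only; the function name is an assumption, and the formula used is the usual one for a moving average of non-autocorrelated elements, $r_k = \sum_j b_j b_{j+k} / \sum_j b_j^2$.

```python
def ma_autocorrelation(b):
    """Autocorrelation coefficients r_0, r_1, ... of zeta(t) = sum_j b[j]*eta(t-j)."""
    norm = sum(x * x for x in b)
    return [sum(b[j] * b[j + k] for j in range(len(b) - k)) / norm
            for k in range(len(b))]

print(ma_autocorrelation([1, 2]))     # non-regular sequence
print(ma_autocorrelation([1, 0.5]))   # regular sequence
```

Both calls give $r_0 = 1$ and $r_1 = 0.4$, so the correlogram alone cannot distinguish the two representations.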

3) Next, let $r_1 = \tfrac16$, $r_2 = -\tfrac13$; $r_k = 0$ for $k > 2$.

The relation (256) reads in this case

$\tfrac23\,(x^2 + 0.5x - 0.5)(-0.5x^2 + 0.5x + 1) = -\tfrac13 x^4 + \tfrac16 x^3 + x^2 + \tfrac16 x - \tfrac13$.

Paying regard to the identity $x^2 + 0.5x - 0.5 = (x - 0.5)(x + 1)$, a short calculation will
show that the group $(b^{(k)})$ consists of two sequences, viz. (1, 0.5, −0.5), which is the
regular one, and (1, −1, −2). The general formula for a corresponding regular
process reads

$\{\zeta(t)\} = \{\eta(t)\} + 0.5\,\{\eta(t-1)\} - 0.5\,\{\eta(t-2)\}$.

We find without difficulty

$a_k = \tfrac13 (\tfrac12)^k + \tfrac23 (-1)^k$.

The non-regular process is given by

$\{\zeta^{(1)}(t)\} = 0.5\,\{\eta(t)\} - 0.5\,\{\eta(t-1)\} - \{\eta(t-2)\} =$

$= \{\eta^{(1)}(t)\} + 0.5\,\{\eta^{(1)}(t-1)\} - 0.5\,\{\eta^{(1)}(t-2)\}$.

Formula (260) gives for the non-autocorrelated residual

$\{\eta^{(1)}(t)\} = \tfrac12\{\eta(t)\} - \tfrac34\{\eta(t-1)\} - \tfrac38\{\eta(t-2)\} - \tfrac{3}{16}\{\eta(t-3)\} - \cdots$
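
The way the group of sequences arises from the factorization can be sketched in a few lines. This is an illustration under stated assumptions, not the procedure of the text: the function names are hypothetical, only real roots are handled, and a sequence is generated by replacing any root off the unit circle by its reciprocal.

```python
from itertools import product

def poly_from_roots(roots):
    """Coefficients (highest power first) of prod (x - root), leading coefficient 1."""
    coeffs = [1.0]
    for root in roots:
        shifted = coeffs + [0.0]          # multiply by x
        for i, c in enumerate(coeffs):
            shifted[i + 1] -= root * c    # subtract root * polynomial
        coeffs = shifted
    return coeffs

def coefficient_group(roots, tol=1e-9):
    """Distinct sequences obtained by keeping or reflecting each real root off the unit circle."""
    choices = []
    for root in roots:
        if abs(abs(root) - 1.0) < tol:
            choices.append([root])                # a root on the unit circle is kept
        else:
            choices.append([root, 1.0 / root])    # keep the root or take its reciprocal
    group = {tuple(round(c, 9) for c in poly_from_roots(pick))
             for pick in product(*choices)}
    return sorted(group, reverse=True)

print(coefficient_group([0.5, -1.0]))
```

With the roots 0.5 and −1 of the present example, the sketch returns the two sequences (1, 0.5, −0.5) and (1, −1, −2).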

4) A few remarks in connexion with the illustrations concluding Chapter II will 
be sufficient to show how the general formulae given in the present chapter will 
work in the case of a normal process of moving averages. The developments below 
cover both processes of linear autoregression and processes of moving averages. 

The matrix of the infinite quadratic form appearing in the characteristic function 
of a non-autocorrelated normal process {r]} with dispersion 1 is nothing else than 
the unit matrix. Now, keeping in mind the substitution procedure indicated on 
p. 90, the relations (210) will verify the product formula already given on p. 91, 


1 

h h 
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0 . 
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0 
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V,’ ■ 






. . . 


/ • 


. . . 


which yields the matrix belonging to a normal process of finite or infinite moving
averages. The previous illustrations exemplify the fact that there are, in general,
several sequences (b) giving rise to the same matrix in the right member.

The general inversion formula (200) implies that

The verification is immediate. In fact, paying regard to (208) and (210) we obtain
in the first place


1 rg . . 


10 0..' 


X^y X^ * X^ • Z?g, . . 

1 ■ ■ 


1 0 . . 



0 , . . 

ri 1 . . 


C^g 1 . . 


0,0 , z® , . . 



> 




Multiplying the right member with the first factor in (265), and keeping in mind the 
relations (201), we arrive at the right member of (265). 

5) Let $p_i$ be the multiplicity of a real root or of a conjugate complex root-pair of
(251). Then the number of averages of type (249) is $\prod_i (p_i + 1)$, the index $i$ running
over those roots and root-pairs which are $\neq 1$ in modulus.



CHAPTER IV. 

On the application of some stationary schemes.

27. Preliminary remarks. Disposition. 

In applied time series analysis, the chief problem is to find a 
hypothetical scheme which from a theoretical viewpoint is appro- 
priate to the phenomenon considered, and gives a satisfactory fit 
to the observational data. Another desideratum is, of course, that 
the hypothesis be as simple as possible. 

As shown in detail in sections 15 and 16, the general stationary 
process embraces all of the hypotheses about time series surveyed 
in Chapter I. The wide scope of the stationary process is due to 
the fact that the restrictions are reduced to a minimum: Besides 
the indispensable postulate that the probability laws must not 
contradict themselves (see (53) — (54)), the only further assumption 
is that time itself will not influence these probability laws (see 
(55)). In other words, time is thought of as a passive medium; 
roughly speaking, this means that any prognosis based upon the 
past development will depend only on this same development — 
i.e. supposing that the same development had taken place with a 
constant lag, the corresponding forecast would not differ, apart 
from the displacement in time. 

It stands to reason that the assumption of stationarity is legitimate
in the most varied fields of scientific research. Restricting
ourselves in such cases to equidistant time points, we have at our
disposal all of the schemes falling under the discrete stationary
process, and this is a larger class than the continuous stationary
process (cf. p. 70 f).

From the viewpoint of the theory of probabilities, the purely
random process is the simplest type case of a stationary scheme.
In sections 15 and 16, certain other type cases were constructed
on the basis of the purely random process, and the scheme of
hidden periodicities was interpreted as a stationary process. The
special theoretical models mentioned are rather simple, inasmuch
as an adequate description of their structure is possible by means
of linear methods, such as the periodogram analysis and the analysis
of the graph of autocorrelation coefficients.

The present chapter is reserved for some applications of the
above-mentioned simple schemes, in particular the schemes of linear
regression. It will be seen that the results obtained are rather
promising. We note in advance that certain points in the applications
will give rise to theoretical discussions; the analysis in the
previous chapters was chiefly concerned with the structural properties
of the hypothetical models, while questions bearing upon their
application were touched upon only incidentally.

Of course, in applying different hypothetical schemes to observa- 
tional data, different methods are required. 

In the search for a hypothetical model suitable to a stationary
phenomenon, the construction of an empirical periodogram is a
classical method of fundamental importance. A careful periodogram
construction is a safe method for discovering hidden periodicities
if such are really present. On the other hand, approximate
methods often involve definite dangers. For instance, the
Bruns-Oppenheim method (see section 5) fails whenever the periodic
elements are covered by a random component. The bias in question,
which was already observed by J. I. Craig (1916), is of
interest also in view of other methods of analysis. For this reason,
the »Craig effect» will be examined in some detail. This is done
in the next section.

As shown in detail in Chapter III, periodogram analysis is an
inadequate method of research in the cases of linear autoregression
and of moving averages. The present study being focussed on these
schemes, we have to look for appropriate substitutes for the
periodogram method. In the memoir where G. U. Yule (1927) introduces
the scheme of linear autoregression, an empirical parallel to the
autoregression analysis as developed in section 19 forms the leading
method of research. This method yields a first substitute for the
periodogram construction. Next, as emphasized by Sir G. Walker
(1931), the autocorrelation coefficients behave quite differently in
the schemes of hidden periodicities and linear autoregression. Hence,
the graph of the serial coefficients will yield important information
about the nature of the phenomenon considered. The analysis
in section 26 shows that the scheme of moving averages, too,
presents a characteristic graph of autocorrelation coefficients, a
circumstance which augments the importance of the method proposed
by Walker. For the sake of brevity in writing, the graphs of
serial and autocorrelation coefficients will be termed correlograms
(empirical and hypothetical respectively).

In the following, the methods proposed by G. U. Yule (1927)
and Sir G. Walker (1931) will be carried further on the basis of
the theoretical investigations in the previous chapters, and used
in the applications to empirical data. A critical survey of the
original methods of Yule and Walker will be given in section 29.
In section 30 follow a summary and a critical examination of the
modified methods.

The two sections concluding the present study are reserved for
applications of the scheme of moving averages and the scheme of
linear autoregression. Both these schemes of linear regression may
be attached directly to familiar lines of time series analysis.
Returning to this point later, special attention will be drawn to
the different type of forecast yielded by these schemes as compared
with the scheme of hidden periodicities. While the forecasts
delivered by the latter scheme cover an infinite future, the former
schemes will yield efficient forecasts only over a limited period of
time. However, this limitation is outweighed by the greater
efficiency in the short time forecasts yielded by the schemes of linear
regression.

Finally, in discussing the applications, certain generalizations of 
the schemes of linear regression will be touched upon. 


28. On the Craig effect 

In section 5 we have surveyed a few methods for separating the
individual components in a sum of harmonics (17). The classical
method is the construction of a periodogram; the short cut
indicated by S. Oppenheim (1909) (see p. 18) is based on the
difference relations satisfied by the function (17).

The scheme of hidden periodicities consists of a composed harmonic
on which a purely random component is additively superposed.
Even in this case a periodogram will point out the frequencies of
the individual harmonics. But as emphasized by J. I. Craig (1916),
the Oppenheim method will be biassed by the random component.
Proceeding to an examination of this bias, which will be termed
the »Craig effect», it will be sufficient for our purpose to consider
a scheme of simple structure. In doing this, we shall regard the
composed harmonic as a sample series of a singular process. This
device, which will not affect the proof, is used merely in order to
illustrate the connexion between the scheme of hidden periodicities 
as defined in section 8 and the process of hidden periodicities as 
defined in section 15i2. 

Let $\{\psi(t)\}$ stand for a singular process satisfying the relation

$\Delta^2\psi(t-1) + k^2\cdot\psi(t) = 0$, $0 < k < 2$.

Let $\{\eta(t)\}$ be purely random, and independent of $\{\psi(t)\}$. Let further
$\{\eta(t)\}$ have a finite dispersion and a vanishing mean, and let a
process of hidden periodicities be defined by

(266) $\{\zeta(t)\} = \{\psi(t)\} + \{\eta(t)\}$.

Denoting by $\psi_i(t, t-1, \ldots, t-n) = [\psi_i(t), \psi_i(t-1), \ldots, \psi_i(t-n)]$
a sample series section of the process $\{\psi(t)\}$, we have

$\Delta^2\psi_i(t-s-1) + k^2\,\psi_i(t-s) = E[\Delta^2\eta(t-1) + k^2\,\eta(t)] = 0$.

Now, let $k^2$ be unknown. The value delivered by the Oppenheim
method is that minimizing the expression

$\frac{1}{n-1}\sum_{s=1}^{n-1} [\Delta^2\psi_i(t-s-1) + k^2\,\psi_i(t-s)]^2$.

A short reduction leads to the following value which is obviously
unbiassed,

(267) $k^2 = -\sum_{s=1}^{n-1} \psi_i(t-s)\cdot\Delta^2\psi_i(t-s-1) \Big/ \sum_{s=1}^{n-1} \psi_i^2(t-s)$.

According to (31), the frequency $\lambda$ of the harmonic $\psi_i(t)$ is given by

(268) $\lambda = 2 \arcsin k/2$.



Let it next be assumed that a sample series $\zeta_i(t, t-1, \ldots, t-n)$
is given, and that we know neither $k^2$ nor the sample series $\psi_i$
and $\eta_i$. The assumption concerning an additive random element
corresponds to the actual situation when seeking for a period in
an empirical time series. Applying now the Oppenheim method,
we must minimize

(269) $\frac{1}{n-1}\sum_{s=1}^{n-1} [\Delta^2\zeta_i(t-s-1) + k^2\,\zeta_i(t-s)]^2$,

which gives (cf. (267))

(270) $k^2 = -\sum_{s=1}^{n-1} \zeta_i(t-s)\cdot\Delta^2\zeta_i(t-s-1) \Big/ \sum_{s=1}^{n-1} \zeta_i^2(t-s)$.

This expression depends on the actual path of the sample series
$\eta_i(t, t-1, \ldots, t-n)$, but a sufficient approximation is delivered by
the $k^2$-value minimizing the expression

(271) $E_i[\Delta^2\psi(t-1) + k^2\,\psi(t)]^2 + E[\Delta^2\eta(t-1) + k^2\,\eta(t)]^2$,

where we have written $E_i[f[\psi]]$ for $\lim_{n\to\infty} \frac{1}{n-1}\sum_{s=1}^{n-1} f[\psi_i(t-s)]$.

A short calculation shows that this approximation reads

(272) $k^2 \sim [-E_i\,\psi(t)\cdot\Delta^2\psi(t-1) + 2E\,\eta^2(t)] \,/\, [E_i\,\psi^2(t) + E\,\eta^2(t)]$.

The difference between on one hand (270) and (272) and on the
other hand the unbiassed formula (267) gives rise to the Craig
effect. Formula (272) shows that this effect will depend on the
chance-determined amplitude of the harmonic constituting
$\psi_i(t, t-1, \ldots, t-n)$.

It is seen that the value given by (272) equals $k^2$ when, and
only when, $k^2 = 2$, i.e. when $\psi(t)$ has the period 4 time units.
A brief reflection shows further that the Oppenheim method will
under-estimate the period if this is above 4 time units, while the
reverse will be true if the period is below 4 time units. Moreover,
the larger the variance of the random component $\eta$, the larger is
the Craig effect.
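
The bias described by (272) can be reproduced by a small simulation. The following sketch is not part of the original analysis: the function names, the Gaussian random component, the seed and the series length are assumptions, and (272) is used in the form $k^2 \sim (k_0^2\,E\psi^2 + 2D^2(\eta)) / (E\psi^2 + D^2(\eta))$, where $k_0^2$ denotes the true value.

```python
import math
import random

def true_k2(period):
    """k^2 = 4 sin^2(lambda/2) for a harmonic of the given period, cf. (268)."""
    return 4.0 * math.sin(math.pi / period) ** 2

def craig_approximation(k2, psi_power, noise_var):
    """Approximation (272), using E[psi * second difference of psi] = -k2 * E[psi^2]."""
    return (k2 * psi_power + 2.0 * noise_var) / (psi_power + noise_var)

def oppenheim_k2(zeta):
    """Estimate (270): minus sum of zeta(t) * second difference, over sum of zeta(t)^2."""
    inner = range(1, len(zeta) - 1)
    num = sum(zeta[t] * (zeta[t + 1] - 2.0 * zeta[t] + zeta[t - 1]) for t in inner)
    den = sum(zeta[t] ** 2 for t in inner)
    return -num / den

def simulate(period, noise_var, n, seed):
    """Harmonic with unit mean square plus Gaussian noise (both are assumptions)."""
    rng = random.Random(seed)
    lam = 2.0 * math.pi / period
    amp = 1.0 if period == 2 else math.sqrt(2.0)
    return [amp * math.cos(lam * t) + rng.gauss(0.0, math.sqrt(noise_var))
            for t in range(n)]

zeta = simulate(period=2, noise_var=0.2, n=20000, seed=1)
print(oppenheim_k2(zeta), craig_approximation(true_k2(2), 1.0, 0.2))
```

For a period of 4 time units the approximation returns the true value $k^2 = 2$ whatever the noise variance, in line with the remark above.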

Illustrations. Two simple examples of the Craig effect will be given on the basis 
of the model series (13 T) and (13?) presented in table 4. Applying the relations 
(270), (268) and (18), the following results were obtained, the sums running from
$t = 2$ to $t = 999$.

                        $\sum \zeta_t\cdot\Delta^2\zeta_{t-1}$   $\sum \zeta_t^2$   $k^2$     $\lambda$    $p$
First model series           −4041                    1112       3.6340   144.8°   2.486
Second model series          −5568                    1711       3.2542   128.8°   2.795



The correct value of the period being 2 units in both the time series, the Craig
effect is seen to be substantial. The values obtained are in good agreement with
the approximate formula (272). In fact, observing that in each of the two cases
$E_i[\psi(t)\cdot\Delta^2\psi(t-1)] = -4$, and that the variances of the random components are
respectively 0.2 and 0.6, we get in the first case $k^2 \sim 4.4/1.2 = 3.667$, and in the second
$k^2 \sim 5.2/1.6 = 3.25$. In full agreement with the remarks attached to formula
(272), the period is over-estimated, and the Craig effect is larger in the second model
series than in the first.
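
The figures of the illustration follow from the two sums alone. As a hedged check (the function name is an assumption), the sketch below applies (270) and (268) to the quoted sums and converts the frequency to a period by $p = 2\pi/\lambda$.

```python
import math

def oppenheim_summary(sum_cross, sum_square):
    """Return (k^2, frequency in degrees, period) from the two sums entering (270)."""
    k2 = -sum_cross / sum_square
    lam = 2.0 * math.asin(math.sqrt(k2) / 2.0)   # relation (268)
    return k2, math.degrees(lam), 2.0 * math.pi / lam

for sums in ((-4041, 1112), (-5568, 1711)):
    k2, lam_deg, period = oppenheim_summary(*sums)
    print(round(k2, 4), round(lam_deg, 1), round(period, 3))
```

The values of $k^2$ and $\lambda$ agree with those tabulated; the period agrees to within about one unit in the third decimal, the tabulated value apparently being computed from the rounded frequency.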


In the above analysis of the scheme (266) we have examined the
Oppenheim method for determining the parameter $k$ in an approach
of type $\zeta(t) = (2 - k^2)\cdot\zeta(t-1) - \zeta(t-2)$. We
shall next apply the same method in starting from a more general
approach, viz.

(273) $\zeta(t) = a_1\cdot\zeta(t-1) + a_2\cdot\zeta(t-2)$.

Letting as before $\zeta_i(t, t-1, \ldots, t-n)$ represent a sample series
section connected with the scheme (266) of hidden periodicities, we
must in the present case minimize (cf. (269))

(274) $\frac{1}{n-1}\sum_{s=0}^{n-2} [\zeta_i(t-s) - a_1\cdot\zeta_i(t-s-1) - a_2\cdot\zeta_i(t-s-2)]^2$

in respect of $a_1$ and $a_2$. Now, using an approximation of the same
type as in (272), and paying regard to (46), we get

(275) $r_1 - a_1 - a_2 r_1 \sim 0$, $\quad r_2 - a_1 r_1 - a_2 \sim 0$,

having written $r_k = \cos\lambda k \,/\, (1 + 2d_i^2)$. Here $\lambda$ is the frequency of a
sample series connected with the singular process $\{\psi(t)\}$, and
$d_i^2 = D^2(\eta) / D_i^2(\psi)$, where $D_i(\psi)$ is the dispersion in the sample series
$\psi_i(t, t-1, \ldots)$. Solving (275), we obtain

(276) $a_1 \sim \dfrac{2\cos\lambda\,(\sin^2\lambda + d_i^2)}{\sin^2\lambda + 4d_i^2 + 4d_i^4}$, $\quad a_2 \sim -\dfrac{\sin^2\lambda - 2d_i^2\cdot\cos 2\lambda}{\sin^2\lambda + 4d_i^2 + 4d_i^4}$.


The minimizing of (274) is obviously analogous to that phase of
the autoregression analysis described in section 19, where $\zeta(t)$ is
linearly approximated by $\zeta(t-1)$ and $\zeta(t-2)$. It is in view of
this analogy that the above formulae are of interest. Now, if
$D(\eta) = 0$, we obtain from (276) the coefficients $a_1 = 2\cos\lambda$ and
$a_2 = -1$ appearing in the identity (cf. (35))

$\zeta(t) = \psi(t) = 2\cos\lambda\cdot\psi(t-1) - \psi(t-2)$.

It should further be observed that we have in this case
$\cos\lambda = a_1 / 2\sqrt{-a_2}$. On the other hand, if $D(\eta) \neq 0$, the latter relation
will be disturbed by a Craig effect. In fact, we get from (276)

(277) $\cos\bar\lambda \sim \dfrac{(\sin^2\lambda + d_i^2)\cdot\cos\lambda}{\sqrt{(\sin^2\lambda - 2d_i^2\cdot\cos 2\lambda)(\sin^2\lambda + 4d_i^2 + 4d_i^4)}}$.

Approximating the period $p$ of $\psi(t)$ by means of the biased frequency
$\bar\lambda$ given by (277), the Craig effect is seen to be particularly large
if $\sin\lambda$ is small. It would serve no purpose to discuss the sign of
the deviation or to enter into details on a singular process $\{\psi(t)\}$
of general structure.

Illustration. Considering the model series dealt with in the previous illustration,
we have found by minimizing (274): $-910 = a_1\cdot 1114 - a_2\cdot 911$;
$930 = -a_1\cdot 911 + a_2\cdot 1114$. This system gives $a_1 = -0.4051$, $a_2 = 0.5036$. Since
the values sought for are $a_1 = -2$ and $a_2 = -1$, the method examined is completely
misleading. Formula (276) explains the failure: in the present case $\sin\lambda = 0$, and the
resulting a-values will show no tendency whatever to approximate the a-values sought for.
As is readily verified, (276) gives $-a_1 \sim a_2 \sim 0.4167$.
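
The arithmetic of this illustration is easily retraced. The sketch below is only a check under stated assumptions: the function names are hypothetical, the linear system is the one quoted above, and formula (276) is taken in the reading $a_1 \sim 2\cos\lambda(\sin^2\lambda + d^2)/(\sin^2\lambda + 4d^2 + 4d^4)$, $a_2 \sim -(\sin^2\lambda - 2d^2\cos 2\lambda)/(\sin^2\lambda + 4d^2 + 4d^4)$, with $d^2 = 0.2$ and $\lambda = \pi$ for the model series of period 2.

```python
import math

def solve_2x2(m11, m12, m21, m22, y1, y2):
    """Solve m11*a1 + m12*a2 = y1, m21*a1 + m22*a2 = y2 by Cramer's rule."""
    det = m11 * m22 - m12 * m21
    return (y1 * m22 - m12 * y2) / det, (m11 * y2 - m21 * y1) / det

def coefficients_276(lam, d2):
    """Approximate a1, a2 for a harmonic of frequency lam with noise ratio d2 (assumed reading of (276))."""
    den = math.sin(lam) ** 2 + 4.0 * d2 + 4.0 * d2 ** 2
    a1 = 2.0 * math.cos(lam) * (math.sin(lam) ** 2 + d2) / den
    a2 = -(math.sin(lam) ** 2 - 2.0 * d2 * math.cos(2.0 * lam)) / den
    return a1, a2

a1, a2 = solve_2x2(1114, -911, -911, 1114, -910, 930)
print(round(a1, 4), round(a2, 4))
print(coefficients_276(math.pi, 0.2))
print(coefficients_276(math.pi / 3, 0.0))   # noise-free case: 2*cos(lam) and -1
```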

Summing up the above analysis, we conclude that the Oppenheim
method yields no adequate substitute for a periodogram construction
in case the periodicities are covered by a random element, and it
does not seem worth the trouble to derive modifications neutralizing
the bias involved.


29. On earlier applications of the scheme of linear autoregression.

As far as I am aware, there are only two earlier investigations
which present direct applications of the schemes of linear regression
to empirical data, viz. those already referred to by G. U. Yule
(1927) and Sir G. Walker (1931). These memoirs are concerned
with sunspots and air pressure respectively, and in both cases it is
the scheme of autoregression that is applied. An examination of
the main lines of these investigations in the light of the previous
analysis will be given in the present section. For the sake of
completeness, we shall also touch upon a passage in the already
mentioned study on expectance theory by K. Stumpff (1936), where
an empirical correlogram is dealt with by use of a method related
to that proposed by Sir G. Walker.

The memoir of G. U. Yule (1927) starts with a discussion of a
model series, say $\zeta_t$, constructed on the basis of a relation of type

(278) $\zeta_t - k\cdot\zeta_{t-1} + \zeta_{t-2} = \eta_t$, $\quad -2 < k < 2$,

the purely random series $\eta_t$ being given by dice-throwing. The
constant $k$ is chosen in the interval (−2, 2), which implies that the
roots of the characteristic equation of (278) are complex, and of
modulus unity. Thus it follows from the general analysis in section
22 that a process $\{\zeta(t)\}$ corresponding to (278) is non-stationary.
However, the evolutive tendency is rather weak, and the 300 elements
constituting Yule's model series actually present fluctuations
of a stationary appearance.

The parameter $k$ in the model series $\zeta_t$ being chosen so as to
correspond to a period of 10 time units, Yule lays stress upon the
structural resemblance between his model series and the yearly
index of sunspots. Pursuing this suggestion in the later sections
of his memoir, he gives two different methods for a refined analysis
of the structure of the index. Yule works on the A. Wolfer
index 1751–1923.

In his first approach, Yule starts from the hypothesis that the
sunspot index, say $\zeta_t$, satisfies a relation of type (278). In order
to determine $k$, he minimizes the sum of the squared »disturbances»
$\eta_t$. Interpreting the relation (278) as ruling the movement of a
pendulum subjected to random shocks, Yule derives the intrinsic
period of the pendulum which would correspond to the $k$-value
obtained. The period thus derived being too short, viz. $p = 10.08$
years, he finds that the hypothesis (278) gives a better value, $p = 11.03$
years, when applied after graduating the sunspot numbers.

Having thus assumed a linear regression (278) between $\zeta_t + \zeta_{t-2}$
and $\zeta_{t-1}$, Yule applies a graphic test of this approach. Referring
to the graphs given on p. 277, he says on the same page: »On the
whole, however, divergence from linearity does not look as if it
would be a serious trouble».

In his second approach, Yule starts from the relation

(279) $\zeta_t + a_1\cdot\zeta_{t-1} + a_2\cdot\zeta_{t-2} = \eta_t$.

Proceeding as in the case (278), he determines the parameters (a)
by minimizing $\sum \eta_t^2$, and interprets the results by the use of the
analogy with a pendulum. The values found for $a_1$ and $a_2$ correspond
to a damped intrinsic oscillation. The ungraduated index
gives the period $p = 10.600$ years, while the graduated index as
before gives a better value, $p = 11.164$ years.

The graduated index gives rise to a smaller variance in the
disturbances $\eta_t$ than the ungraduated index. Dividing the variance
of $\eta_t$ by the variance of the sunspot index under analysis, the
approach (278) gives 0.243 for the ungraduated index, and 0.115 for
the graduated one. The corresponding values in the approach (279)
are 0.198 and 0.102 respectively.

In applying generalized hypotheses of type (278) and (279), Yule
finds that the introduction of more parameters does not bring on
a marked decrease in the variance of the disturbances. In other
words, the experiments »fail to suggest the presence of any period
other than the fundamental, a conclusion entirely in accord with
the work of Larmor and Yamaga» (p. 295).

In a summary, Yule suggests that the sunspot numbers »should
be regarded as analogous to the data that would be given by
observations of a disturbed periodic movement, such as that of a
pendulum subjected to successive small random impulses» (p. 294).
Let us discuss this hypothesis in the light of the previous analysis.

As already pointed out, the approach (278) does not correspond
to a stationary process, for if the series $\eta_t$ were purely random, the
secondary model series would present oscillations increasing in
amplitude with time, i.e. be evolutive. In view of this observation,
it is not surprising that the disturbances $\eta_t$, as calculated from
(278) on the basis of the $k$-value previously determined, form a
series of non-random character. Quoting Yule, the series $\eta_t$ shows
»a tendency for positive disturbances during the approach to the
maximum of the sunspot numbers, negative during the approach
to minimum» (p. 294 f.).

We have already seen that the approach (279) gives rise to a
somewhat smaller variance in the disturbances $\eta_t$. However, having
introduced a second parameter, the slight reduction in the variance
of $\eta_t$ does not give a sufficient reason for preferring (279) over
(278). On the other hand, (279) corresponds to a proper stationary
process, which is a circumstance speaking in favour of this
approach.

In itself, the outcome of significant constants $a_1$ and $a_2$ does
not imply that the approach (279) is adequate. Without further
evidence, we cannot even conclude that (279) is better than the
hypothesis of a strictly periodic component in the sunspot index.
In fact, our analysis in the previous section has shown that an
autoregression analysis will give rise to non-vanishing coefficients
(a) also in the case of hidden periodicities. It is interesting to
notice that even the effect of the graduation, i.e. the increase in the
period, might be explained by assuming the index to be ruled
by a scheme of hidden periodicities (cf. p. 137). However, a
sufficient reason for rejecting the latter hypothesis is that the
deduction of a strictly periodic component actually leaves an
»error» with a dispersion substantially above that of the disturbances
obtained from (279). In fact, the assumption of one harmonic
component in the sunspot index would explain at most 28 % of the
variance in the index (see e.g. K. Stumpff (1937), p. 126). On
the other hand, we have seen that the approach (279), which
contains only two parameters, is able to explain at least 80 % of
the variance in question. In this connexion it is rather interesting
to notice that the periodogram of the sunspot index given by
Stumpff (l. c.) bears a certain resemblance to our fig. 4 (p. 116),
and thus agrees with the hypothesis of linear autoregression. (Cf.
also the remarks attached to (131)).

The values found by Yule for the parameters in (279) are
$a_1 = -1.34254$, $a_2 = 0.65504$ for the ungraduated, and $a_1 = -1.51527$,
$a_2 = 0.80245$ for the graduated sunspot index. Using the analogy
of a swinging pendulum, these values correspond to a rather heavy
damping: in the case $a_2 = 0.80245$, the amplitude of a swing
would be reduced to 29 % in the duration of one period (see
G. U. Yule (1927), p. 282). With this heavy damping, a purely
random series of impulses would not be likely to produce such
large amplitudes as in the sunspot fluctuations. In full agreement
with this argument, which in section 32 will be developed as a
test of the scheme of linear autoregression, the disturbances
calculated from (279) on the basis of the sunspot index present
variations of a non-random type, inasmuch as they »do occur
just in the kind of way that would be necessary to maintain a
damped vibration» (p. 286).
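
The periods and the damping quoted from Yule can be recomputed from the coefficients. The sketch below is illustrative and rests on the usual reading of (279) as $\zeta_t + a_1\zeta_{t-1} + a_2\zeta_{t-2} = \eta_t$ with complex characteristic roots of modulus $\sqrt{a_2}$ and argument $\theta$, $\cos\theta = -a_1/2\sqrt{a_2}$; the function names are assumptions.

```python
import math

def intrinsic_period(a1, a2):
    """Period 2*pi/theta of the damped oscillation, with cos(theta) = -a1 / (2*sqrt(a2))."""
    theta = math.acos(-a1 / (2.0 * math.sqrt(a2)))
    return 2.0 * math.pi / theta

def damping_per_period(a1, a2):
    """Factor by which the amplitude is reduced over one period: modulus ** period."""
    return math.sqrt(a2) ** intrinsic_period(a1, a2)

print(intrinsic_period(-1.34254, 0.65504))     # ungraduated index
print(intrinsic_period(-1.51527, 0.80245))     # graduated index
print(damping_per_period(-1.51527, 0.80245))   # reduction of a swing in one period
```

The two periods come out close to 10.60 and 11.16 years, and the damping factor for the graduated index close to 0.29.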

Summing up the above discussion, we have seen that in replacing
the approach of strict periodicity by a hypothesis containing an
acting random element, G. U. Yule (1927) obtains a substantially
better fit to the sunspot data. In the terminology of the present
study, the approach (279) as applied to the ungraduated index
corresponds to a scheme of linear autoregression. On the other
hand, as applied to the graduated index, it is obvious that the
approach (279) approximately corresponds to the assumption that the
ungraduated index is ruled by a scheme consisting in a purely
random process independent of and superposed on a process of
linear autoregression.

As mentioned by Yule, the disturbances calculated from (279)
present a certain systematic variation. This non-random behaviour,
which seems to be conditioned by the small value found for $a_2$ in
(279), remains unexplained by the hypothesis of linear autoregression.
In view of this circumstance, it seems to me as if the sunspot
index calls for further investigation. Perhaps the methods developed
in section 32 would yield a scheme fitting the data better. But
it is also possible that more satisfactory results would be obtained
in an approach involving a non-linear function of the index. Since
to pursue these suggestions falls outside the program of this survey,
we shall end our discussion of the Yule memoir.

Sir G. Walker (1931) follows up the Yule approach (279), and
studies an autoregression relation of type

(280) $\zeta_t + a_1\,\zeta_{t-1} + \cdots + a_h\,\zeta_{t-h} = \eta_t$.

As mentioned in section 24, Walker finds that the autocorrelation
coefficients corresponding to (280) satisfy the relations (cf. p. 104)

(281) $r_k + a_1 r_{k-1} + \cdots + a_h r_{k-h} = 0$, $\quad k > h$.

Assuming that the roots of the characteristic equation (34) are
different, he further gives $r_k$ as the general solution of (32).

We are now in a position to examine Walker's methods for
applying (280) to empirical data. The basic idea is simply to
compute the serial coefficients $\bar r_k$, and to determine the constants
$a_k$ so that the relations (281) will be approximately satisfied when
replacing $r_k$ by $\bar r_k$.

Sir G. Walker works on air pressure data from Port Darwin
1880–1925, taking the quarter of a year for time unit. The graph
of serial coefficients (in our terminology the correlogram)
ranges from $k = 1$ to $k = 147$ quarters (p. 531). The graph shows
a rapid decrease from $\bar r_1 = 0.76$ towards zero. For larger $k$-values, the
correlogram presents fluctuations with rather small amplitudes. In
fact, up to $k = 100$ all the coefficients $\bar r_k$ are less than 0.3 in
modulus.

Sir G. Walker finds that in the interval $0 \le k \le 40$ a fairly good
approximation to the correlogram is yielded by the function

(282) $r_k = 0.19\,(0.96)^k \cos(\pi k/6) + 0.15\,(0.98)^k + 0.66\,(0.71)^k$.

This function, which is seen to involve a damped harmonic with
period $p = 12$ quarters, satisfies the difference equation

$r_k - 3.35\,r_{k-1} + 4.43\,r_{k-2} - 2.71\,r_{k-3} + 0.64\,r_{k-4} = 0$.

Concluding inversely from (281) on (280), Walker finally arrives
at the representation

(283) $\zeta_t - 3.35\,\zeta_{t-1} + 4.43\,\zeta_{t-2} - 2.71\,\zeta_{t-3} + 0.64\,\zeta_{t-4} = \eta_t$.
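
The link between the graduating function and the difference equation can be checked by expanding the characteristic polynomial. In the sketch below (function name hypothetical) the four roots are read off from (282) as $0.96\,e^{\pm i\pi/6}$, 0.98 and 0.71; this reading of the constants is an assumption consistent with the coefficients quoted.

```python
import cmath
import math

def monic_from_roots(roots):
    """Real coefficients (highest power first) of prod (x - root)."""
    coeffs = [1.0 + 0.0j]
    for root in roots:
        shifted = coeffs + [0.0j]         # multiply by x
        for i, c in enumerate(coeffs):
            shifted[i + 1] -= root * c    # subtract root * polynomial
        coeffs = shifted
    return [c.real for c in coeffs]

roots = [cmath.rect(0.96, math.pi / 6), cmath.rect(0.96, -math.pi / 6), 0.98, 0.71]
print([round(c, 2) for c in monic_from_roots(roots)])
```

To two decimals the expansion reproduces the coefficients 1, −3.35, 4.43, −2.71, 0.64 of the difference equation.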

Proceeding to an examination of Walker's methods, we observe
in the first place that an unconditioned conclusion from (281) on
(280) is not permitted. In fact, the autocorrelation coefficients
corresponding to the relation (280) satisfy not only the relations
(281) but also (222–223), the latter relations not having been
observed by Walker. It is seen that the coefficients $r_1, \ldots, r_{h-1}$
will be uniquely determined by the system (222) in terms of the
coefficients $a_k$. In other words, the coefficients in (280) determine
not only the periods and the damping factors of the individual
components in the expression (33), but also the coefficients of the
components. Thus, without anything further we cannot be sure
that the autocorrelation coefficients corresponding to the approach
(283) will be given by the graduating expression (282). On the
contrary, the system (222) corresponding to (283) actually gives
rise to autocorrelation coefficients which are substantially different
from (282). Having read off the values $r_1 = 0.76$, $r_2 = 0.55$, $r_3 = 0.35$

from the graph of (282) given by Walker (p. 528), I have found 
from (222) the values = 7\2 = ‘12, 7'^ — '4=3 for the autocorre- 

lation coefficients belonging to the approach (283). 

In view of the oversight pointed out, it is not surprising that
the relation (283) gives rise to a larger dispersion in the disturbances
$\eta_t$ than in the air pressure data $\zeta_t$. As a matter of fact, I have
found $D(\eta) = 2.4\,D(\zeta)$, while the result $D(\eta) = 0.28\,D(\zeta)$ obtained by
Walker (p. 530) is based on an incorrect use of the relation
mentioned in a note in section 25 (cf. (317)). We conclude that if
the approach (283) is to be applied to the air pressure data, the
coefficients must be modified. Having stated this, it is rather
interesting that the simple approach

(284) $\zeta_t - 0.73\,\zeta_{t-1} = \eta_t$

gives a fairly good fit to the first few serial coefficients. In fact,
according to (238) the approach (284) gives $r_1 = 0.73$, $r_2 = 0.53$,
$r_3 = 0.39$, $r_4 = 0.28$, while the air pressure serial coefficients given
by Walker (p. 528) read $\bar r_1 = 0.76$, $\bar r_2 = 0.56$, $\bar r_3 = 0.36$, $\bar r_4 = 0.18$. A
short calculation shows further that (284) gives $D(\eta) = 0.68\,D(\zeta)$.
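
The figures quoted for the simple approach follow from the elementary properties of a first-order scheme. The sketch below assumes that (284) is the relation $\zeta_t - a\,\zeta_{t-1} = \eta_t$ with $a = 0.73$, for which $r_k = a^k$ and $D(\eta)/D(\zeta) = \sqrt{1 - a^2}$; the function names are hypothetical.

```python
import math

def ar1_correlogram(a, lags):
    """Autocorrelation coefficients r_1..r_lags of zeta_t - a*zeta_{t-1} = eta_t."""
    return [a ** k for k in range(1, lags + 1)]

def residual_dispersion_ratio(a):
    """Ratio D(eta)/D(zeta) for the first-order scheme."""
    return math.sqrt(1.0 - a * a)

print([round(r, 2) for r in ar1_correlogram(0.73, 4)])
print(round(residual_dispersion_ratio(0.73), 2))
```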

Let us in conclusion attach a few remarks to the empirical
correlogram presented by Sir G. Walker (p. 531). As already
mentioned, the serial coefficients show rather small deviations from
zero in the interval $3 < k < 40$. On the other hand, the increase
in amplitude for certain $k$-values $> 40$ might be due to the successive
reduction in the number of correlates. Perhaps this argument is
sufficient to explain also why the fluctuations are somewhat larger
in that alternative variant of a correlogram given by Walker,
where all serial coefficients are based on 77 pairs of correlates. As
the fluctuations, furthermore, seem rather irregular and aperiodic
(at least to my eye), it is doubtful whether it would be possible
to improve sensibly the approach (284) by taking into account more
distant elements $\zeta_{t-2}$, $\zeta_{t-3}$, etc. In this connexion it is rather
interesting to notice that according to the general analysis there
exists no process of linear autoregression having (282) for
autocorrelation coefficients. Another reason for resting satisfied with
the simple approach (284) is that the ordinates in the periodogram
presented by Sir G. Walker (p. 526) are all lying on about the
same level: unlike that of the sunspots, this periodogram does not
suggest a scheme of linear autoregression with a tendency to
periodicity (cf. p. 142). At any rate, a more detailed analysis of
the air pressure data is beyond the scope of the present survey.

Using formula (127), K. Stumpff (1936) arrives at an expectance 
theory which generalizes the classical Schuster theory. In apply- 
ing his theory, Stumpff works on air pressure data {Potsdam V? 
1925 — 1926, equidistance 1 hour), and replaces the coefficients 
Vh in (127) by a corresponding set of graduated serial coefficients. 
Claiming that the graduated values belong to a scheme of linear 
autoregression of type 

(285) $\zeta(t) - 2p\cdot\zeta(t-1) + p^2\cdot\zeta(t-2) = \eta(t)$,

Stumpff makes a mistake similar to Walker's pointed out above.
Proceeding as in the developments on p. 112, I have found the
formula $r_k = [1 + k\cdot(1 - p^2)/(1 + p^2)]\cdot p^k$ for the autocorrelation
coefficients belonging to the scheme (285), while K. Stumpff (p. 53)
gives $r_k = (1 - k\cdot\log p)\cdot p^k$.


30. Preliminary survey of methods.

In this section we shall give a brief summary of the methods 
used in the later applications and a few critical remarks on the 
scope of these methods. 

As pointed out in earlier sections, a careful analysis of the
structural properties of a time series requires statistical data
covering a rather long period. On the other hand, the series must
not change its general character in the course of the interval of
observation, for then a stationary scheme would be inadequate.
For instance, if a trend is present in the material, it should be
removed before starting the analysis (cf. p. 1).

Considering a scheme (39) of hidden periodicities, and disregarding
the value $r_0 = 1$, the correlogram $r_k$ consists of superposed harmonics
such that the periods of the individual components equal those in
the time series considered (cf. (46)). In the two type cases of linear
regression, on the other hand, the correlogram has the horizontal
axis for asymptote. In fact, in the scheme of linear autoregression
the correlogram $r_k$ is a function (33) such that the roots of the
characteristic equation (34) are of modulus less than unity, and in
the scheme of moving averages all the autocorrelation coefficients
are zero beyond a certain $k$-value (cf. also (177)).

For the sake of concreteness, we show below three hypothetical 
correlograms exemplifying the type cases considered. 



Fig. 6. Correlograms illustrating the schemes of hidden periodicities (thin line),
linear autoregression (broken line), and moving averages (thick line).


The correlograms in this figure are based on the following parameters. In the
case of hidden periodicities, the correlogram has been derived from formula (46),
where we have chosen s=Oi = l; Z>^(77)=='125; The case of linear auto- 

regression is illustrated by a correlogram of type (243), having taken C = 

— The moving average correlogram, finally, has been obtained from (250), 

putting A==4; hi — 'l, &2 = ‘4, 63 = — "3, ’2 
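The qualitative contrast between the three type cases can be sketched numerically. The snippet below is only an illustration: the function names, the period, the decay factor and the moving-average coefficients are assumptions chosen for the example, not the (partly illegible) parameters behind fig. 6, and the autoregressive case is reduced to the simplest one-root form.

```python
import math

def harmonic_correlogram(periods, weights, max_lag):
    """Superposed harmonics: r_k = sum_i w_i * cos(2*pi*k / period_i), weights normalised."""
    total = sum(weights)
    return [sum(w * math.cos(2 * math.pi * k / p) for p, w in zip(periods, weights)) / total
            for k in range(max_lag + 1)]

def ar1_correlogram(rho, max_lag):
    """Simplest autoregressive case: r_k = rho**k, which tends to zero for |rho| < 1."""
    return [rho ** k for k in range(max_lag + 1)]

def ma_correlogram(b, max_lag):
    """Moving average with weights b = [b_0, ..., b_h]: r_k vanishes for k > h."""
    norm = sum(c * c for c in b)
    h = len(b) - 1
    return [sum(b[j] * b[j + k] for j in range(h - k + 1)) / norm if k <= h else 0.0
            for k in range(max_lag + 1)]

if __name__ == "__main__":
    for name, r in [("hidden periodicity", harmonic_correlogram([8.0], [1.0], 12)),
                    ("autoregression", ar1_correlogram(0.7, 12)),
                    ("moving average", ma_correlogram([1.0, 1.0], 12))]:
        print(name, [round(v, 3) for v in r])
```

The harmonic correlogram keeps oscillating with undiminished amplitude, the autoregressive one decays towards the horizontal axis, and the moving-average one is exactly zero beyond the order of the average.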

Because of the different behaviour of the autocorrelation coef- 
ficients in the schemes mentioned, it may be expected that we would 
obtain useful suggestions by inspecting the empirical correlogram 
when searching for an adequate scheme to be applied to an ob- 
servational time series. For this reason, the construction of an 
empirical correlogram is taken as the starting point in the following 
applications. 

It should be observed that the correlogram construction by formula
(13) involves a relatively small amount of numerical computation.
No trigonometrical or other mathematical tables are required.
Another definite advantage is that the correlogram is obtained
directly from the statistical data, without any preceding preparation
of the material. Accordingly, the empirical correlogram seems
particularly well suited for serving as a first indicator of which
type of scheme to apply to the data.
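How little computation an empirical correlogram requires can be seen from a minimal sketch. Formula (13) is not reproduced in this section, so the code assumes the common product-moment form with a single overall mean and variance; the function name `serial_coefficients` is hypothetical, and the exact normalisation of (13) may differ.

```python
def serial_coefficients(series, max_lag):
    """Return [r_0, r_1, ..., r_max_lag] for a list of observations.

    Assumption: r_k = sum_t (x_t - mean)(x_{t+k} - mean) / sum_t (x_t - mean)^2,
    i.e. one overall mean and variance for all lags.
    """
    n = len(series)
    mean = sum(series) / n
    c0 = sum((x - mean) ** 2 for x in series)
    coeffs = []
    for k in range(max_lag + 1):
        ck = sum((series[t] - mean) * (series[t + k] - mean) for t in range(n - k))
        coeffs.append(ck / c0)
    return coeffs

if __name__ == "__main__":
    # A strictly alternating series has r_1 close to -1 and r_2 close to +1.
    print(serial_coefficients([1, -1] * 50, 3))
```

Only additions, multiplications and one division per lag are involved, in line with the remark that no mathematical tables are needed.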

If the empirical correlogram suggests a scheme of hidden period- 
icities, the next step would be to construct a periodogram for a 
more detailed analysis of possible periodicities in the material under 
investigation. 

Next, if the correlogram suggests a scheme of linear autoregression,
our first problem is to find a scheme (101) such that the
corresponding hypothetical correlogram will fit the empirical one.
The chief difficulty is to derive suitable values for the coefficients
(a); when having arrived at a set of coefficients (a), the
corresponding autocorrelation coefficients will be uniquely determined by
the system (221-2), and the residuals $\eta_t$ by the relations (280). It
is further a desideratum that these residuals be as small as
possible. Having seen above that these problems are more intricate
than emphasized in earlier studies of the graph of serial coefficients,
it will be found that an empirical autoregression analysis as
proposed by G. U. Yule (1927) will be useful in this connexion.

Finally, it may happen that the empirical correlogram will suggest
a scheme of moving averages. As far as I know, the problem of
fitting this scheme to observational data has not been attacked in
earlier literature. A fundamental problem in this sphere was
formulated by Prof. H. Cramér in his 1933 Course, viz. to find a
moving average with a prescribed correlogram. It will be seen that
the relation (256) gives a starting point for attacking this problem.

Having now given a preliminary survey, the details of the methods 
will be discussed when presenting the results of their application. 
The present section will be concluded by attaching some remarks 
of general scope concerning the limitation of the methods outlined. 

In time series analysis, significance problems are extremely in- 
tricate. In dealing with serial coefficients, we have i. a. to pay 
regard to the fact that their magnitude is conditioned by the size of 
the statistical masses to which they refer. Having already touched 
upon this circumstance in section 24, the point in question will be 
taken up for a more detailed examination in appendix B. 

The following applications aim only at illustrating the qualitative
differences between various hypothetical models. Consequently, all
questions about the significance and the interpretation of the
quantitative results fall outside the scope of this study, and again an
explicit warning is given against attaching importance to the
numerical values found for the parameters of the different models
fitted to the observational data.

However, even when limiting the program to an analysis of the 
qualitative structure of a phenomenon as described by a time series, 
we must be cautious when interpreting the results. Stating the 
case briefly: The results will to some extent be conditioned by the 
methods used in the analysis. Of course, in point of principle the 
situation is the same in all applications of hypothetical models to 
empirical data. Let us dwell a moment on some circumstances 
which are peculiar to time series analysis, especially as based on 
the correlogram and the autoregression methods. 

The correlogram sums up the autocorrelation properties of a time
series, and the autoregression analysis, too, is based solely on the
autocorrelation coefficients. We conclude that neither method is
able to distinguish between different schemes with coincident
autocorrelation coefficients. Having already seen examples of this when
dealing with the scheme of moving averages (cf. section 26), further
examples are readily obtained by using non-linear operations in the
construction of stationary processes. For instance, letting $\{\eta(t)\}$
represent a purely random process, it is evident that

(286) = 

will define a stationary process. Assuming that $E[\eta(t)] = 0$,
a short calculation will show that this process is non-autocorrelated.
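The point can be illustrated by simulation. Since formula (286) is not legible here, the sketch below uses a non-linear construction of our own choosing, $\xi(t) = \eta(t) \cdot \eta(t-1)$ with $\eta$ uniform on $(-1, 1)$; it is an assumed example and not necessarily the process of (286). Its serial coefficients are near zero, as for a purely random series, although successive values are clearly dependent (their squares are correlated).

```python
import random

def sample_autocorr(x, lag):
    """Product-moment serial coefficient of the list x at the given lag."""
    n = len(x)
    mean = sum(x) / n
    c0 = sum((v - mean) ** 2 for v in x)
    ck = sum((x[t] - mean) * (x[t + lag] - mean) for t in range(n - lag))
    return ck / c0

rng = random.Random(0)                       # fixed seed for reproducibility
eta = [rng.uniform(-1.0, 1.0) for _ in range(20001)]

# Hypothetical non-linear scheme: product of two successive random elements.
xi = [eta[t] * eta[t - 1] for t in range(1, len(eta))]
xi_sq = [v * v for v in xi]

r_xi = [sample_autocorr(xi, k) for k in range(1, 6)]   # all close to zero
r_sq1 = sample_autocorr(xi_sq, 1)                      # clearly positive

if __name__ == "__main__":
    print("serial coefficients of xi:  ", [round(r, 3) for r in r_xi])
    print("lag-1 coefficient of xi**2: ", round(r_sq1, 3))
```

A correlogram alone would therefore not separate this scheme from a purely random one; the dependence shows up only in properties other than the serial coefficients.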

Thus, having found a hypothetical scheme yielding a good fit to
an empirical correlogram, it is perfectly possible that there are
other schemes which yield an equally close approximation. When
it is necessary to choose between different schemes, it may happen
that theoretical arguments will speak in favour of one of the
schemes. As exemplified in the applications, the schemes of linear
regression often seem plausible from theoretical viewpoints, at least
as a first approximation. On the other hand, a rational choice
between different schemes may be alternatively based on an
examination of other structural properties of the time series than its
serial coefficients. Such lines of research, however, fall outside of
the program of the present study.

Another point which must be kept in mind has special reference 
to the autoregression analysis. Having in sections 19 and 20 sub- 
jected a general stationary process to an autoregression analysis, the 
investigation resulted in a canonical form for the process considered, 
viz. a decomposition in two mutually non-correlated processes, each 
of a structure readily comprehended in respect to certain fundamental 
properties. Now, even if a complete parallel to this analysis could 
be carried through when dealing with empirical data — which of 
course is impossible, one reason being the necessity of dealing with 
only a finite number of observations — it is not certain that the 
autoregression analysis would be an adequate method of research. 
In fact, in point of principle an autoregression analysis can reveal 
only linear interrelations between the elements in a time series. 
For instance, the implicit relation 

(287) = 

where {rjit)} is purely random, P[|?2® I < 1] == 1, and i | < 1, defines 
a stationary process {^(0}; a linear autoregression analysis would 
here give rise to an infinite sequence of residuals, and to a canon- 
ical representation which is more complicated than the simple rela- 
tion (287). 

The above argument shows that there is a certain risk of over- 
estimating the outcome of a linear autoregression analysis. When 
proceeding to residuals of higher order, more parameters are in- 
troduced, and it may be that a simpler, possibly non-linear approach 
would give better results. As mentioned before, the use of non- 
linear methods does not fall within the scope of this study. 


31. Some applications of the scheme of moving averages. 

In economic theory, a great deal of interest has recently been
paid to the schemes of linear regression. However, the discussion
of the advantages of the new ideas over the hypothesis of strict
periodicity seems to have been carried on exclusively by general
theoretical argumentation, without attempting to fit the recommended
schemes directly to observational data. Considering, in particular,
the scheme of moving averages, this has not so far as I know been
tried on empirical time series in other fields of scientific research
either. When selecting statistical material in order to give
applications of the previous analysis, I chose economic time series, one
reason being the lack pointed out. In connexion with the
account of these applications given in the sequel, we shall touch
upon some related lines of economic research where the previous
developments seem to yield proper tools for a deeper analysis.

As mentioned in section 9, J. Bartels (1935) has found
"quasi-persistent periodicity" in certain geophysical time series. A
periodogram analysis here being inadequate, these series invite an
application of the schemes of linear regression. Thanks to their at
once flexible and simple construction, these schemes often seem
plausible also a priori. A few arguments on this line will be
touched upon in the sequel when discussing certain geophysical and
other phenomena which from a theoretical viewpoint might be
interpreted by means of the schemes of linear regression.

The series of yearly wheat prices in Western Europe 1518 — 1869 
compiled by Sir W. Beveridge (1921) was chosen for my earliest 
experiment in applying the correlogram method. The purpose 
being to apply a stationary scheme, the analysis was concerned with 
Beveridge’s trend-free index of fluctuations (p. 449 ff.). In order 
to avoid changes in the structure of the index, the analysis was 
restricted to the last hundred data. An account of the analysis 
follows. 

Having made the inconsequential modification of reducing the
Beveridge index by 100 units, the time series investigated is given
in col. (1) of table 7. The first 15 serial coefficients obtained from
this material with the use of formula (13) read as follows.

Table 6. Serial coefficients of the Beveridge wheat price index
1770-1869.

r_1  =  .614,   r_2  =  .090,   r_3  = -.156,   r_4  = -.115,   r_5  = -.006,
r_6  ~  .003,   r_7  = -.006,   r_8  = -.116,   r_9  = -.166,   r_10 ~  .102,
r_11 = -.033,   r_12 =  .084,   r_13 = -.011,   r_14 =  ...,    r_15 =  .136.

The correlogram based on these coefficients is shown in fig. 7.

It is seen that $r_1$ is rather large, and that all of the following
serial coefficients are lying in the interval $-.17 < r_k < .14$, i.e.
rather close to zero. To my eye, the correlogram definitely suggests
a scheme of moving averages. Accordingly, the next step in the
analysis will be to search for a moving average with autocorrelation
coefficients approximating the serial coefficients under investigation.
As mentioned before, this problem is due to H. Cramér (see p. 123).

Quite generally, the problem before us may be stated as follows.
A set of numbers $u_1, \dots, u_h$ being given, does there exist a moving
average (249) with autocorrelation coefficients $r_k$ such that $r_k = u_k$
for $1 \le k \le h$? If the answer is in the affirmative, we know from
section 26 that there in general will exist a finite group of moving
averages with the prescribed autocorrelation coefficients, and we
are also in possession of a direct method for determining the
coefficients (b) of these moving averages.

Fig. 7. Correlogram of the Beveridge wheat price index 1770-1869 (thick line),
and hypothetical correlograms corresponding to the approaches (292) and (294)
(broken lines).

Paying regard to the relation (256), we conclude that if there
exists a moving average (249) satisfying our conditions, we must
have

(288) $u(x) = u_h x^h + u_{h-1} x^{h-1} + \cdots + u_1 x + 1 + \frac{u_1}{x} + \cdots + \frac{u_h}{x^h} =$

      $= \frac{1}{K^2}\,(x^h + b_1 x^{h-1} + \cdots + b_{h-1} x + b_h)\left(b_h + \frac{b_{h-1}}{x} + \cdots + \frac{b_1}{x^{h-1}} + \frac{1}{x^h}\right).$

If $x_0$ is a root of the equation $u(x) = 0$, then $1/x_0$ is another
root. It follows that the substitution $z = x + x^{-1}$ will transform
$u(x)$ into a real polynomial in $z$ of order $h$, say $v(z)$. Let us write

(289) $v(z) = v_0 z^h + v_1 z^{h-1} + \cdots + v_{h-1} z + v_h.$

The successive calculation of $v_0, \dots, v_h$ from the coefficients $u_i$ needs
no comment.

It is evident that if $z$ is a root of $v(z) = 0$, two roots of $u(x) = 0$
will be obtained from the equation

(290) $P(x, z) = x^2 - zx + 1 = 0.$

The roots of this equation are given by

(291) $x_1 = \frac{z}{2} + \sqrt{\frac{z^2}{4} - 1}, \qquad x_2 = \frac{z}{2} - \sqrt{\frac{z^2}{4} - 1}.$

The product of the roots being unity, we conclude that unless both
the roots are of modulus unity, one of them is situated inside, and
the other outside the unit circle.

Denoting conjugate complexity by an asterisk, we know that if
$z$ is a complex root of $v(z) = 0$, another root reads $z^*$. Further,
if $P(x, z) = 0$ has the roots $x_1$ and $1/x_1$, it is evident that $P(x, z^*) = 0$
has the roots $x_1^*$ and $1/x_1^*$. In that case one of the real polynomials
$(x - x_1)(x - x_1^*)$ and $(x - 1/x_1)(x - 1/x_1^*)$ must be a factor
in the polynomial

$b(x) = x^h + b_1 x^{h-1} + \cdots + b_{h-1} x + b_h$

appearing in (288).

In case $v(z) = 0$ presents a real root, say $z_0$, we must distinguish
two cases. If $|z_0| \ge 2$, both $x_1$ and $x_2$ are seen to be real. Keeping
in mind that either $x - x_1$ or $x - x_2$ is a factor in $b(x)$, we
conclude that this case corresponds to the real roots of the equation
$b(x) = 0$.

On the other hand, if $|z_0| < 2$, we know from (291) that $x_1$ and
$x_2$ are conjugate complex, and of modulus unity. The factors
$(x - x_1)$ and $(x - x_2)$ being complex, we conclude that both of them
must be contained in $b(x)$. Since one zero of $v(z)$ corresponds to
one zero of $b(x)$, this is impossible unless $z_0$ is a root of even
multiplicity of $v(z) = 0$.

After these remarks, the following theorem demands no explanation.

Theorem 12. A necessary and sufficient condition that there
exists a moving average (249) with autocorrelation coefficients $r_k$
equalling $u_k$ for $1 \le k \le h$ is that the auxiliary polynomial $v(z)$ defined
by (289) has no zero $z_0$ of odd multiplicity in the real interval
$-2 < z_0 < 2$.

If this condition is satisfied, the sequences (b) sought for will be
given by the coefficients in the real polynomials $b(x)$ satisfying the
relation (288). In full agreement with the analysis in section 26,
we conclude further from the above discussion that there are at
most $2^h$ such sequences (b), and that the polynomials $b(x)$ may be
written on the form $(x - x_1)(x - x_2) \cdots (x - x_h)$, where, denoting
by $z_1, z_2, \dots, z_h$ the roots of $v(z) = 0$, the real or complex quantity
$x_1$ is a root of $P(x, z_1) = 0$, and $x_2$ is a root of $P(x, z_2) = 0$, etc.

Returning to the Beveridge index of wheat prices, we shall
give a few applications of the method outlined above.

In the correlogram $r_k$ (see fig. 7), the small deviations from zero
for $k > 1$ might perhaps be looked upon as pure chance products.
Thus we are led to investigate whether there exists a moving
average $\eta(t) + b_1\,\eta(t-1)$ with autocorrelation coefficient $r_1$ equalling
.614. Putting $h = 1$ and $u_1 = .614$, and following the general
method, we get $u(x) = .614x + 1 + .614x^{-1}$ and $v(z) = .614z + 1$.
Since the root $-1/.614 = -1.63$ of $v(z) = 0$ is lying in the critical
interval $-2 < z < 2$, we conclude from theorem 12 that there
exists no moving average with $r_1 = .614$ and $r_k = 0$ for $k > 1$.

A short reflection shows that in all moving averages of type
$\eta(t) + b_1\,\eta(t-1)$ we have $-.5 \le r_1 \le .5$. It is further evident that
there is only one average of this type such that $r_1 = .5$, viz.

(292) $\zeta(t) - m = \eta(t) + \eta(t-1).$

Consequently, this average, the correlogram of which is shown
in fig. 7, will yield the closest fit to the prescribed value $r_1 = .614$.
If we rest satisfied with the simple average (292), we have to
interpret the deviations between the serial coefficients $r_k$ given on
p. 151 and the values $r_1 = .5$, $r_2 = r_3 = \cdots = 0$ as due to chance. If
a better fit is desired, averages involving more parameters (b) must
be used. A few examples of this will be given.
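The bound on $r_1$ for a two-term average can be made explicit by a short calculation. The sketch below assumes the usual expression for the first autocorrelation coefficient of such an average with purely random $\eta(t)$; the symbol $b_1$ is the coefficient of $\eta(t-1)$ as in the text.

```latex
\zeta(t) - m = \eta(t) + b_1\,\eta(t-1)
\quad\Longrightarrow\quad
r_1 = \frac{b_1}{1 + b_1^2},
\qquad
\tfrac12 - r_1 = \frac{(1 - b_1)^2}{2\,(1 + b_1^2)} \ge 0,
\qquad
\tfrac12 + r_1 = \frac{(1 + b_1)^2}{2\,(1 + b_1^2)} \ge 0 .
```

Hence $-.5 \le r_1 \le .5$, and $r_1 = .5$ is reached only for $b_1 = 1$, which is the average (292).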

As is readily verified, the approach $u_1 = .614$, $u_2 = .090$ gives
$u(x) = .090x^2 + .614x + 1 + .614x^{-1} + .090x^{-2}$, and $v(z) = .090z^2 +
+ .614z + .820$. The roots of $v(z) = 0$ being $z_1 = -1.82$ and
$z_2 = -5.00$, it follows from theorem 12 that there does not exist
any moving average with the autocorrelation coefficients prescribed.
However, only a slight modification in $u_2$ is sufficient to remove $z_1$
from the critical interval. In fact, expressing (cf. (291)) that a root
of $v(z) = 0$ shall equal $-2$, we get
$\left(\frac{u_1}{2u_2} - 2\right)^2 = \left(\frac{u_1}{2u_2}\right)^2 - \frac{1 - 2u_2}{u_2}$,
which gives $u_2 = u_1 - \tfrac12 = .114$.

The equation $v(z) = 0$ corresponding to $u_1 = .614$, $u_2 = .114$ reads
$.114z^2 + .614z + .772 = 0$, and we get $z_1 = -2.000$, $z_2 = -3.386$.

In order to prepare the construction of the corresponding sequences
(b), we solve $P(x, -2) = x^2 + 2x + 1 = 0$, which gives the double
root $x = -1$, and $P(x, -3.386) = x^2 + 3.386x + 1 = 0$, which
gives the real roots $x = -.3269$ and $x = -3.0591$. It follows
that there exist two binomials $b(x)$ which satisfy the conditions
laid down, viz. $b_1(x) = (x + 1)(x + .3269) = x^2 + 1.3269x + .3269$,
and $b_2(x) = (x + 1)(x + 3.0591) = x^2 + 4.0591x + 3.0591$.

Using the terminology introduced in section 26, the binomial
$b_1(x)$ gives rise to a regular moving average, viz.

(293) $\zeta_1(t) - m = \eta(t) + 1.3269\,\eta(t-1) + .3269\,\eta(t-2).$

Since there is only one more process in the group of averages
with the same correlogram as $\zeta_1(t)$, it is evident that this one is
symmetrical with $\zeta_1(t)$, and thus given by

$\zeta_2(t) - m = .3269\,\eta(t) + 1.3269\,\eta(t-1) + \eta(t-2).$

In full agreement with the general theory, $\zeta_2(t)$ can alternatively
be derived from $b_2(x)$ by multiplying the coefficients by $K_1/K_2$,
taking for $K_1^2$ the sum of the squared coefficients in $b_1(x)$, and
similarly for $K_2^2$; a short calculation will verify that $K_1/K_2 =
.3269$, etc.

To check the coefficients (b) obtained, it is sufficient to compute
the autocorrelation coefficients of $\zeta_1(t)$ from the general formula
(250). As it should be, we find $r_1 = .614$, $r_2 = .114$.
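For the case $h = 2$ the whole procedure reduces to two applications of the quadratic formula, and can be sketched as follows. The function names `quad_roots` and `regular_ma2` are hypothetical, the tolerances are arbitrary choices, and the code is only an illustration of the steps described above under the assumption $u_2 \neq 0$; it is not taken from the source.

```python
import cmath

def quad_roots(a, b, c):
    """Both roots of a*x^2 + b*x + c = 0 (complex arithmetic throughout)."""
    d = cmath.sqrt(b * b - 4 * a * c)
    return (-b + d) / (2 * a), (-b - d) / (2 * a)

def regular_ma2(u1, u2, tol=1e-9):
    """Coefficients (b1, b2) of the regular average eta(t) + b1*eta(t-1) + b2*eta(t-2)
    with r_1 = u1, r_2 = u2 and r_k = 0 for k > 2, or None if no such average exists.
    """
    # Substituting z = x + 1/x in u(x) gives v(z) = u2*z^2 + u1*z + (1 - 2*u2).
    z1, z2 = quad_roots(u2, u1, 1.0 - 2.0 * u2)
    real_roots = abs(z1.imag) < tol and abs(z2.imag) < tol
    double_root = abs(z1 - z2) < 1e-7
    if real_roots and not double_root:
        # A simple real zero inside (-2, 2) violates the condition of theorem 12.
        if min(abs(z1.real), abs(z2.real)) < 2.0 - tol:
            return None
    if real_roots and double_root and abs(z1.real) < 2.0 - tol:
        # Double zero inside (-2, 2): conjugate pair on the unit circle, b(x) = x^2 - z*x + 1.
        return (-z1.real, 1.0)
    inner = []
    for z in (z1, z2):
        xa, xb = quad_roots(1.0, -z, 1.0)       # roots of P(x, z) = x^2 - z*x + 1
        inner.append(xa if abs(xa) <= abs(xb) else xb)
    b1 = -(inner[0] + inner[1])                 # b(x) = (x - x_a)(x - x_b)
    b2 = inner[0] * inner[1]
    return (b1.real, b2.real)

if __name__ == "__main__":
    print(regular_ma2(0.614, 0.090))   # no admissible average
    print(regular_ma2(0.614, 0.114))   # approximately (1.3269, 0.3269), cf. (293)
```

Choosing in each pair the root of smaller modulus yields the regular average; taking the other root of a pair instead would give the remaining members of the group.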

Observing that the increase in $u_2$ from .090 to .114 has brought
on a decrease in $z_1$ from $-1.82$ to $-2.00$, and an increase in $z_2$
from $-5.00$ to $-3.39$, it is seen that the parameters are very
susceptible to variations in the initial $u$-values. Another example
of this is given by the fact that a second slight increase in $u_2$
will make the roots $z_1$ and $z_2$ coincide. As follows from (291), this
will occur when $u_2$ satisfies the relation $(.307)^2 = u_2(1 - 2u_2)$, i.e.
when $u_2 = .1260$.

Using the new value $u_2 = .1260$, and taking as before $u_1 =
= .614$, we get $v(z) = .1260z^2 + .614z + .7480$. The roots of
$v(z) = 0$ read $z_1 = z_2 = -.614/.252 = -2.4365$, and those of
$P(x, -2.4365) = 0$ are $x = -.5225$ and $x = -1.9140$. Consequently,
the binomial $b(x)$ corresponding to the regular one among
the moving averages sought for is simply $(x + .5225)^2 = x^2 +
+ 1.0450x + .2730$. Thus the regular moving average reads

(294) $\zeta_1(t) - m = \eta(t) + 1.0450\,\eta(t-1) + .2730\,\eta(t-2).$

Because of the symmetry, it follows that the moving average
corresponding to $b(x) = (x + 1.9140)^2$ is given by

(295) $\zeta_2(t) - m = .2730\,\eta(t) + 1.0450\,\eta(t-1) + \eta(t-2).$

The group in this case consists of three processes. The remaining
one is obtained from $b(x) = (x + .5225)(x + 1.9140) = x^2 +
+ 2.4365x + 1$. A short calculation gives for this process

(296) $\zeta_3(t) - m = .5225\,\eta(t) + 1.2730\,\eta(t-1) + .5225\,\eta(t-2).$

Checking the calculations, we find that the autocorrelation
coefficients of each of the processes (294)-(296) are given by $r_1 = .6140$,
$r_2 = .1260$. The correlogram of the group (294)-(296) is shown
in fig. 7.
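The check mentioned above is easily repeated. The sketch below assumes that the autocorrelation coefficients of a moving average are the lagged products of its coefficients divided by the sum of their squares (the role played by formula (250) in the text); the helper name `ma_autocorr` is hypothetical.

```python
def ma_autocorr(coeffs):
    """Return (sum of squared coefficients, [r_1, ..., r_h]) for a moving average
    with the given coefficients of eta(t), eta(t-1), ..., eta(t-h)."""
    norm = sum(c * c for c in coeffs)
    r = [sum(coeffs[j] * coeffs[j + k] for j in range(len(coeffs) - k)) / norm
         for k in range(1, len(coeffs))]
    return norm, r

# Coefficients of the three processes as printed in (294), (295) and (296).
group = {
    "(294)": [1.0, 1.0450, 0.2730],
    "(295)": [0.2730, 1.0450, 1.0],
    "(296)": [0.5225, 1.2730, 0.5225],
}

if __name__ == "__main__":
    for label, coeffs in group.items():
        norm, r = ma_autocorr(coeffs)
        print(label, "sum of squares = %.4f" % norm, "r_1 = %.4f" % r[0], "r_2 = %.4f" % r[1])
```

All three processes give $r_1 \approx .614$ and $r_2 \approx .126$, and also the same sum of squared coefficients, so that they agree in variance as well as in correlogram.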

The appreciable effect of even small changes in $u_2$ is evident
from the above. Comparing the regular averages (293) and (294),
it is seen that the increase in $u_2$ from .114 to .126 has caused a
decrease in $b_1$ from 1.3269 to 1.0450, and a decrease in $b_2$ from
.3269 to .2730.

The following example shows in detail how the method works
when $v(z) = 0$ has complex roots. Let us start from the values
$u_1 = .60$, $u_2 = .09$, $u_3 = -.15$, $u_4 = -.10$, which are seen to
approximate very closely the first four serial coefficients in the
index under investigation. With but little reduction we obtain
$-10^2\,v(z) = 10z^4 + 15z^3 - 49z^2 - 105z - 62$. Solving $v(z) = 0$,
we get $z_1 = -2.1272$, $z_2 = 2.5103$, $z_3 = -.9415 + .5240i$, $z_4 =
= -.9415 - .5240i$. Concluding from theorem 12 that there exists
a group of moving averages with the prescribed correlogram, a
short reflection shows that the group consists of 8 processes.

Solving $P(x, z_1) = 0$, we get $x = -.7013$ and $x = -1.4259$,
while $P(x, z_2) = 0$ gives $x = .4966$ and $x = 2.0137$. The roots of
$P(x, z_3) = 0$ were found to be $x = -.3381 - .6679i$ and $x = -.6034 +
+ 1.1919i$. According to the discussion preceding theorem 12, the
roots of $P(x, z_4) = 0$ are obtained by replacing $i$ by $-i$ in the roots
of $P(x, z_3) = 0$.

Writing

$B(x) = (x + .3381 - .6679i)(x + .3381 + .6679i) = x^2 +
+ .6762x + .560402,$

the regular moving average will be obtained from $b(x) = (x +
+ .7013)(x - .4966) \cdot B(x) = x^4 + .8809x^3 + .3505x^2 - .1208x -
- .1952$, and reads

(297) $\eta(t) + .8809\,\eta(t-1) + .3505\,\eta(t-2) - .1208\,\eta(t-3) -
- .1952\,\eta(t-4).$

Squaring the coefficients, we get the sum $K^2 = 1.95153$.

According to the general analysis, a second average with the
same correlogram will be delivered by $b_1(x) = (x + 1.4259)(x - .4966) \cdot
\cdot B(x) = x^4 + 1.6055x^3 + .4807x^2 + .0420x - .3968$. The sum of
squared coefficients is $K_1^2 = 3.967917$, which gives $K/K_1 = .701304$.
Multiplying the coefficients in $b_1(x)$ by this factor, we get the
coefficients in the corresponding moving average. This is found
to be

$.7013\,\eta(t) + 1.1259\,\eta(t-1) + .3371\,\eta(t-2) + .0294\,\eta(t-3) -
- .2783\,\eta(t-4).$

A third moving average in the group is seen to be obtained
from $(x + .7013)(x - 2.0137) \cdot B(x)$. Proceeding as before, we find
for the corresponding process

$.4966\,\eta(t) - .3159\,\eta(t-1) - .8637\,\eta(t-2) - .8395\,\eta(t-3) -
- .3930\,\eta(t-4).$

In the same way, the polynomial $(x + 1.4259)(x - 2.0137) \cdot B(x)$
yields a fourth process in the group, viz.

$.3483\,\eta(t) + .0308\,\eta(t-1) - .9432\,\eta(t-2) - .7909\,\eta(t-3) -
- .5604\,\eta(t-4).$

The four remaining averages in the group considered correspond
to the complex roots $x = -.6034 \pm 1.1919i$ of $u(x) = 0$. Due to
the symmetry, these processes may be obtained directly from the
four processes above by reversing the order of the coefficients.
For instance, the regular average (297) gives

$-.1952\,\eta(t) - .1208\,\eta(t-1) + .3505\,\eta(t-2) + .8809\,\eta(t-3) +
+ \eta(t-4).$

The above computations have been checked by verifying that
the autocorrelation coefficients of the four processes equal the
prescribed values $r_1 = .6000$, $r_2 = .0900$, $r_3 = -.1500$, $r_4 = -.1000$.
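The construction of the eight processes can be retraced with a few lines of polynomial arithmetic. The sketch below starts from the rounded roots quoted in the text, so its results agree with the printed coefficients only to about three or four decimals; the helper names (`poly_mul`, `b_poly`, `group_member`, `ma_autocorr`) are hypothetical.

```python
import math

# Real quadratic factor belonging to the complex pair -.3381 -/+ .6679i.
B = [1.0, 0.6762, 0.560402]

def poly_mul(p, q):
    """Product of two polynomials given by coefficients, highest power first."""
    out = [0.0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, c in enumerate(q):
            out[i + j] += a * c
    return out

def b_poly(x1, x2):
    """Coefficients of (x - x1)(x - x2) * B(x), highest power first."""
    return poly_mul(poly_mul([1.0, -x1], [1.0, -x2]), B)

def sum_sq(coeffs):
    return sum(c * c for c in coeffs)

def ma_autocorr(coeffs):
    """[r_1, ..., r_h] of the moving average with the given coefficients."""
    norm = sum_sq(coeffs)
    return [sum(coeffs[j] * coeffs[j + k] for j in range(len(coeffs) - k)) / norm
            for k in range(1, len(coeffs))]

regular = b_poly(-0.7013, 0.4966)          # both real roots inside the unit circle
K2 = sum_sq(regular)                       # about 1.9515

def group_member(x1, x2):
    """Rescale b(x) so that the sum of squared coefficients equals that of the regular average."""
    b = b_poly(x1, x2)
    scale = math.sqrt(K2 / sum_sq(b))
    return [scale * c for c in b]

members = [regular,
           group_member(-1.4259, 0.4966),
           group_member(-0.7013, 2.0137),
           group_member(-1.4259, 2.0137)]
members += [list(reversed(m)) for m in members]   # the four averages from the other complex pair

if __name__ == "__main__":
    for m in members:
        print([round(c, 4) for c in m], [round(r, 4) for r in ma_autocorr(m)])
```

Each of the eight coefficient sets reproduces the prescribed values .60, .09, $-.15$, $-.10$ up to the rounding of the roots.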

Having now exemplified the construction of sets (b) belonging to 
moving averages (249) with correlograms approximating that of the 
Beveridge wheat price index, we shall postpone the discussion of 
the results arrived at until we have made a few applications of 
the inversion formula (200) and the relation (260). 

If $\{\zeta(t)\}$ is a regular moving average (249), the primary process
$\{\eta(t)\}$ will be given either by (200) or by (255), the latter formula
corresponding to the exceptional case when the characteristic
equation of the difference relations satisfied by the coefficients (a)
presents roots of modulus unity. This characteristic equation is nothing
else than the equation $b(x) = 0$ used in the general method
exemplified above. This method being based on the calculation of the
roots of the equation mentioned, we are in a position to point out
directly which of the formulae (200) and (255) to apply in the
different examples, and to carry the analysis further on the basis
of the general developments in section 26.

Returning first to the approach (292), we have to apply (255),
for the root $x = -1$ of $b(x) = 0$ is of modulus unity. Proceeding
as in the illustration 1 of section 26, we get

(298) $\eta(t) = \lim_{1 > a \to 1}\,[\zeta(t) - m - a\,(\zeta(t-1) - m) + a^2\,(\zeta(t-2) - m) - \cdots].$

By construction, the second approach (293) is also such that a root
of $b(x) = 0$ is of modulus unity.

In order to get an application of (200), we proceed to the
approach (294). According to theorem 8, the coefficients (a) will satisfy
the difference relation $a_k + 1.0450\,a_{k-1} + .2730\,a_{k-2} = 0$. The
characteristic equation being $b(x) = (x + .5225)^2 = 0$, it follows from
section 6 that $a_k$ may be written on the form $a_k = (A + k \cdot B) \cdot
\cdot (-.5225)^k$. The constants $A$ and $B$ may be obtained from the
initial values $a_0 = 1$, $a_1 = -1.0450$ (cf. (97) and (197)). A short
calculation gives $A = B = 1$, and hence $a_k = (1 + k)(-.5225)^k$. In
applying formula (200), we have to insert this expression, and to
replace the process appearing there by $\{\zeta_1(t) - m\}$, where
$m = E[\zeta_1(t)]$. Observing that
$\sum_{k=0}^{\infty} a_k = (1.5225)^{-2} = .4314$, we get the inversion formula

(299) $\eta(t) = -.4314\,m + \zeta_1(t) - 2\,(.5225)\,\zeta_1(t-1) +
+ 3\,(.5225)^2\,\zeta_1(t-2) - 4\,(.5225)^3\,\zeta_1(t-3) + \cdots$

It is seen that the approach (297) gives a formula of the same
type, but, of course, with more complicated coefficients (a).

Denoting the Beveridge index by $\zeta_t$ and assuming that $\zeta_t$ is
a sample series of the moving average $\{\zeta_1(t)\}$ given by (294), formula
(299) may be used for deriving a series $\eta_t$ which corresponds to
this hypothesis. Identifying $m$ with the average of the index
series $\zeta_t$, we get

(300) $\eta_t = -.4314\,m + \zeta_t - 1.0450\,\zeta_{t-1} + .8190\,\zeta_{t-2} - .5706\,\zeta_{t-3} + \cdots$

The sum of the 100 index fluctuations given in table 7 being
$-28$, we get $m = -.28$. Inserting this, and using the index
deviations given in col. (1), formula (300) has given the series $\eta_t$
presented in col. (2). Having put $a_k = 0$ for $k > 12$, the first 12
$\eta_t$-values are partly based on index fluctuations not given in the table.

Apart from the constant $m = -.28$, the moving average (294) as
performed on col. (2) must reproduce col. (1). Thus the values $\eta_t$
obtained may be checked by the simple identity

(301) $\zeta_t - m = \eta_t + 1.0450\,\eta_{t-1} + .2730\,\eta_{t-2}.$

Because of the large number of terms in (300) and (301), there will
sometimes be a deviation amounting to .2 or .3.
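The inversion and the check can be sketched in a few lines. The code below is an illustration under stated assumptions: it applies the truncated weights $a_k = (1 + k)(-.5225)^k$, $k \le 12$, to centred values (so the constant term of (300) is absorbed by the centring), it tests the round trip on an artificial random series rather than on the wheat index, and the function names are hypothetical. The last lines apply identity (301) to the first five rows of table 7.

```python
import random

P = 0.5225   # b(x) = (x + P)^2, so the average (294) has weights 1, 2P = 1.0450, P^2 = .2730

def ma294(eta, m=0.0):
    """zeta_t = m + eta_t + 2P*eta_{t-1} + P^2*eta_{t-2}; output starts at eta index 2."""
    return [m + eta[t] + 2 * P * eta[t - 1] + P * P * eta[t - 2] for t in range(2, len(eta))]

def invert294(zeta, m, n_terms=13):
    """Truncated inversion: eta_t ~ sum_{k < n_terms} a_k * (zeta_{t-k} - m).
    Output entry i corresponds to zeta index i + n_terms - 1."""
    a = [(1 + k) * (-P) ** k for k in range(n_terms)]
    return [sum(a[k] * (zeta[t - k] - m) for k in range(n_terms))
            for t in range(n_terms - 1, len(zeta))]

def residuals_301(zeta, eta, m):
    """Left minus right side of identity (301) for every t with two predecessors."""
    return [(zeta[t] - m) - (eta[t] + 2 * P * eta[t - 1] + P * P * eta[t - 2])
            for t in range(2, len(zeta))]

# Round trip on an artificial purely random series (uniform on (-1, 1)).
rng = random.Random(1)
eta_true = [rng.uniform(-1.0, 1.0) for _ in range(300)]
zeta_sim = ma294(eta_true, m=-0.28)
eta_rec = invert294(zeta_sim, m=-0.28)
max_err = max(abs(e - eta_true[i + 14]) for i, e in enumerate(eta_rec))

# First five rows of table 7 (1770-1774): col. (1) and col. (2).
zeta_tab = [31, 36, 19, 6, 5]
eta_tab = [23.9, 9.3, 3.0, 0.5, 3.9]
res_tab = residuals_301(zeta_tab, eta_tab, m=-0.28)

if __name__ == "__main__":
    print("largest truncation error:", max_err)
    print("residuals of (301) for 1772-1774:", [round(r, 3) for r in res_tab])
```

With 13 terms the truncation error stays well below the rounding of the table, and the residuals of (301) for the tabulated rows are of the size mentioned in the text.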

In analogy with the above, the series $\eta_t$ corresponding to the
regular approach (292) may be obtained by a limit passage based
on the relation (255). Having exemplified in illustration 1 of
section 26 the limit procedure for deriving a primary series $\eta_t$, it is
seen that in point of principle no complications will be met.
Accordingly, we shall not dwell further on the exceptional cases when
$b(x) = 0$ presents a root of modulus unity.

Table 7. Beveridge wheat price index fluctuations (col. 1), and
hypothetical primary series $\eta_t$ (col. 2).

Year   (1)    (2)  |  Year   (1)    (2)  |  Year   (1)    (2)  |  Year   (1)    (2)
1770    31   23.9  |  1795    30   17.6  |  1820   -16     .8  |  1845    15   19.6
1771    36    9.3  |  1796    -5  -27.2  |  1821   -24  -19.3  |  1846    39   19.9
1772    19    3.0  |  1797   -16    7.8  |  1822   -23   -2.8  |  1847   -10  -35.9
1773     6     .5  |  1798   -13  -13.6  |  1823   -29  -20.5  |  1848   -20   12.3
1774     5    3.9  |  1799    20   32.2  |  1824   -29   -6.4  |  1849   -26  -28.8
1775   -12  -15.9  |  1800    39    9.3  |  1825   -31  -18.3  |  1850   -22    5.0
1776   -16    -.2  |  1801    17   -1.2  |  1826   -18    3.2  |  1851   -14  -11.1
1777    -6   -1.2  |  1802     5    4.0  |  1827    -7   -5.1  |  1852     5   15.5
1778   -13  -11.4  |  1803    -6   -9.6  |  1828    14   18.7  |  1853    38   25.1
1779   -21   -8.5  |  1804    26   34.2  |  1829     3  -14.5  |  1854    41   10.8
1780   -13    -.7  |  1805    14  -18.9  |  1830    10   20.7  |  1855    38   20.1
1781   -12   -8.6  |  1806    -2    8.6  |  1831     5  -12.3  |  1856     7  -16.7
1782    -6    3.5  |  1807    -7  -10.6  |  1832   -18  -10.6  |  1857   -18   -5.8
1783    -6   -6.9  |  1808    -6    3.1  |  1833   -20   -5.4  |  1858   -19   -8.1
1784    -8   -1.3  |  1809    -6   -6.0  |  1834   -22  -13.3  |  1859    -3    7.4
1785   -15  -11.4  |  1810     4    9.8  |  1835   -18   -2.5  |  1860    16   10.7
1786   -16   -3.4  |  1811    40   31.8  |  1836   -12   -5.6  |  1861     7   -6.0
1787    -7     .0  |  1812    21  -14.6  |  1837     2    8.7  |  1862    -8   -4.4
1788     8    9.2  |  1813    -4    2.9  |  1838    17    9.6  |  1863   -21  -14.5
1789     8   -1.4  |  1814    -4   -2.7  |  1839     7   -5.2  |  1864   -19   -2.4
1790   -14  -14.8  |  1815    30   32.3  |  1840    -6   -1.9  |  1865    -6     .7
1791   -22   -5.9  |  1816    78   45.2  |  1841     1    4.8  |  1866    19   19.3
1792   -13   -2.6  |  1817    26  -29.7  |  1842    -8  -12.1  |  1867    18   -1.9
1793   -15  -10.5  |  1818    -6   13.1  |  1843   -12    -.3  |  1868    -7   -9.8
1794     3   14.9  |  1819   -14  -19.3  |  1844    -8   -4.1  |  1869     2   13.1

Starting from the hypothesis that a given time series $\zeta_t$ is a
sample series of a regular moving average (249), we have above
exemplified a general method for deriving the corresponding primary
series $\eta_t$. We are now in a position to solve the analogous
problem under the hypothesis that $\zeta_t$ is a sample series of a non-regular
average (cf. p. 129). After the detailed treatment of the regular
case, it will be sufficient to deal with the non-regular averages very
briefly.

Let the characteristic equation of a non-regular moving average 
{^(0} be given by (cf. (258)) 
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I {X) = ^ + W X’^ ^ ^ + &SjLi X + &W) = 

= (xP + Ax + • • + Ap-i X + A^ {afl + Bx x^^ + • • + 

+ JBq—i X 4- JB^ = Oj 

where j) + g = /i, and where the roots of + • • + Ap = 0 

are lying inside, and those of 4- x^~^ 4- • • + Bg = 0 ontside 

the unit circle. 

The coefficients $A_i$ and $B_i$ being real, two real sequences $p_i$ and
$q_i$ will be defined by the systems (cf. (97))

$$\begin{cases} p_n + A_1 p_{n-1} + \dots + A_{n-1} p_1 + A_n = 0, & n = 1, 2, \dots, p-1; \\ p_n + A_1 p_{n-1} + \dots + A_{p-1} p_{n-p+1} + A_p p_{n-p} = 0, & n \ge p; \end{cases}$$

$$\begin{cases} B_q q_n + B_{q-1} q_{n-1} + \dots + B_{q-n+1} q_1 + B_{q-n} = 0, & n = 1, 2, \dots, q-1; \\ B_q q_n + B_{q-1} q_{n-1} + \dots + B_1 q_{n-q+1} + q_{n-q} = 0, & n \ge q; \end{cases}$$

with $p_0 = q_0 = 1$. It is seen that the sum

$$\alpha(t) = \zeta(t) + p_1 \zeta(t-1) + p_2 \zeta(t-2) + \dots$$

will be convergent. Paying regard to certain evident relations between
the coefficients $A_k$, $B_k$, and $b_k$, it follows further

$$(302)\qquad \alpha(t) = K\,[\eta(t) + B_1 \eta(t-1) + \dots + B_q \eta(t-q)]; \qquad K = b_0.$$

Thus prepared, let us form

$$\beta(t) = \alpha(t) + q_1 \alpha(t+1) + q_2 \alpha(t+2) + \dots$$

Observing that the roots of the equation

$$B_q x^q + B_{q-1} x^{q-1} + \dots + B_1 x + 1 = 0$$

are of modulus less than unity, it follows without difficulty that
this sum also is convergent, and that $\beta(t) = B_q \cdot K \cdot \eta(t-q)$.
Since $B_q \neq 0$, we conclude that, apart from a constant factor, the
repeated linear operation performed will yield $\{\eta(t)\}$ in terms of the
non-regular moving average considered.

Comprehending the double transformation, we get

$$(303)\qquad \{\eta(t)\} = \frac{1}{B_q \cdot K} \sum_{n=-\infty}^{\infty} c_n \cdot \{\zeta(t+q+n)\},$$

where $c_0 = 1 + p_1 q_1 + p_2 q_2 + \dots$,

$$c_n = q_n + q_{n+1} p_1 + q_{n+2} p_2 + \dots, \qquad n \ge 0;$$

$$c_{-n} = p_n + p_{n+1} q_1 + p_{n+2} q_2 + \dots, \qquad n \ge 0.$$

The generalization to the case when the characteristic equation
presents roots on the periphery of the unit circle is, of course,
straightforward.
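The double transformation can be tried out numerically in the simplest non-regular case. The following is a minimal sketch in Python, assuming a first-order average $\zeta(t) = b_0\eta(t) + b_1\eta(t-1)$ with $|b_1| > |b_0|$, so that $p = 0$, $q = 1$, $K = b_0$ and $B_1 = b_1/b_0$. The function name, the simulated series and the truncation length are illustrative assumptions and do not come from the text.

```python
import numpy as np

def recover_primary(zeta, b0, b1, n_terms=60):
    """Recover eta from a non-regular average zeta(t) = b0*eta(t) + b1*eta(t-1).

    Assumes |b1| > |b0|, i.e. the root of x + B1 = 0 (B1 = b1/b0) lies
    outside the unit circle, so the inversion runs over future values of zeta.
    """
    B1 = b1 / b0
    K = b0
    # q_n solves B1*q_n + q_{n-1} = 0 with q_0 = 1, hence q_n = (-1/B1)**n.
    q = (-1.0 / B1) ** np.arange(n_terms)
    T = len(zeta)
    eta_hat = np.full(T, np.nan)
    for t in range(1, T - n_terms + 1):
        beta_t = np.dot(q, zeta[t:t + n_terms])   # beta(t) = sum_n q_n * zeta(t+n)
        eta_hat[t - 1] = beta_t / (B1 * K)        # beta(t) = B1 * K * eta(t-1)
    return eta_hat

# Simulated purely random primary series (illustrative data only).
rng = np.random.default_rng(0)
T = 300
eta_true = rng.standard_normal(T)
zeta = np.empty(T)
zeta[0] = 0.5 * eta_true[0]                       # eta(-1) taken as zero
zeta[1:] = 0.5 * eta_true[1:] + eta_true[:-1]     # zeta(t) = 0.5*eta(t) + eta(t-1)

eta_hat = recover_primary(zeta, b0=0.5, b1=1.0)
max_error = np.nanmax(np.abs(eta_hat - eta_true))
```

Because the sum runs over future values of the observed series, the final entries of `eta_hat` stay undefined in this sketch; with a shorter truncation they could only be approximated.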

In order to give an explicit application of the relation (303), let
us consider the group $(G)$ belonging to the regular average (294)
studied in detail before. As to (295), we get by analogy from (299),
and in full agreement with (303),

$$(304)\qquad \eta(t-2) + 0.4314\,m = \zeta(t) - 2\,(0.5225)\,\zeta(t+1) + 3\,(0.5225)^2\,\zeta(t+2) - \dots$$

Considering the remaining average (296) in the group, the coefficients
appearing in (303) are seen to reduce to $p_n = q_n = (-0.5225)^n$. Hence
$c_n = \rho^{|n|}/(1 - \rho^2)$, $\rho = -0.5225$, $n \gtrless 0$. Since $K$ is nothing else

than the factor of ij(0 in (296), and JBq = — 1/'5225, we get 

(305) JJ«-1)= i KsU + p = —-5225. 

n = — 00 

By means of the formulae (304) and (305), it is possible to derive
explicitly the two series $\eta_t$ corresponding to the hypotheses that
the Beveridge index is a moving average (295) or (296) respectively.
The calculations running as in the regular case, no detailed
illustration will be given. Of course, the last $\eta_t$-values can be
calculated only approximately. It should be observed that a complete
check may be based on identities similar to (301).

If a series $\eta_t$ corresponding to a certain process in a group $(G)$
has been derived, it will sometimes be possible to arrive at the primary
series, say $\eta'_t$, corresponding to another average in the group by
means of a simpler procedure than that based on (303). For instance,
representing by $\eta_t$ and $\eta'_t$ the primary series corresponding
to (294) and (296) respectively, the relation (260) gives

■— fjt = p'Fjt — ii — fjt-.i — p (1 ^t-2 

where p== —*5225, and hence p fjt — fjt-i p fjt— i — ‘fjt. The series 
p fjt—i — % say 5*) being readily derived from the series fjt in table 7, 
we get the simple relation (cf. (305)) 
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vt-l =-tt+P &+1 - CU2 + &4 3 

Having now given an account of some applications of the general
theory of moving averages to the Beveridge wheat price index, we
shall next attach some remarks of general scope to certain points
in the analysis. Let us in the first place touch upon the problem
of testing the results derived under a hypothesis of moving
averages.

In a process of moving averages as given by (249), a characteristic
feature is that $\zeta(t)$ is independent of $\zeta(t-k)$ for $k > h$.



Fig. 8. Beveridge wheat price index. The scatters $(\zeta_t, \zeta_{t-1})$ (left), and
$(\zeta_t, \zeta_{t-2})$ (right).


Thus, (249) will form no adequate basis for the analysis of a time
series unless the scatters $(\zeta_t, \zeta_{t-h-1})$, $(\zeta_t, \zeta_{t-h-2})$, etc.
approximate distributions of independent variables. On this line it
would, of course, be possible to develop different types of tests.
Figure 8 contains the scatters $(\zeta_t, \zeta_{t-1})$ and $(\zeta_t, \zeta_{t-2})$
belonging to the Beveridge data given in table 7. Figure 8 (left)
clearly shows that $\zeta_t$ and $\zeta_{t-1}$ must be considered
interdependent. On the other hand, the scatter $(\zeta_t, \zeta_{t-2})$ seems
to permit us to look upon $\zeta_t$ and $\zeta_{t-2}$ as independent, a
circumstance speaking in favour of an average of the simple type
$\zeta_t = \eta_t + b_1 \eta_{t-1}$.

In starting from a specified scheme of moving averages, we are
in a position to derive a hypothetical value for any characteristic
of the series $\zeta_t$ or of the corresponding series $\eta_t$ calculated as
indicated above. In point of principle, every such characteristic as
compared with the corresponding empirical one will give a basis for
testing the hypothetical set-up.

Of course, when a scheme with correlogram approximating the
empirical one is chosen, there will automatically be an agreement
in the main between certain hypothetical and empirical characteristics.
For instance, the serial coefficients of $\eta_t$ will approximate
zero, since the deviations from this hypothetical value will be due
only to the differences between the hypothetical and the empirical
correlogram. The situation is the same with regard to the ratio
between the variances of $\eta_t$ and $\zeta_t$. Considering e.g. the approach
(294), the hypothetical value of this quotient is
$1/(1 + 1.0450^2 + 0.2730^2) = 0.462$. On the other hand, the variances of
the series $\eta_t$ and $\zeta_t$ being 213.7 and 383.6 respectively, the
empirical quotient equals 0.557.
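The two quotients can be checked by direct arithmetic. The short Python sketch below only reuses the figures quoted in the text (the coefficients of (294) and the two variances); the variable names are illustrative.

```python
# Coefficients of the regular average (294); b_0 = 1 is implied.
b = [1.0, 1.0450, 0.2730]

# Hypothetical ratio var(eta) / var(zeta) for a moving average
# performed on a purely random primary series.
hypothetical_ratio = 1.0 / sum(coef ** 2 for coef in b)

# Empirical variances quoted in the text for eta_t and zeta_t.
var_eta, var_zeta = 213.7, 383.6
empirical_ratio = var_eta / var_zeta

print(round(hypothetical_ratio, 3), round(empirical_ratio, 3))  # 0.462 0.557
```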

By construction, the moving averages in a group will present
the same correlogram and the same variance. We conclude from
the above that in point of principle the autocorrelation properties
of the corresponding series $\eta_t$ will give us no criterion for
distinguishing which of the processes should be preferred. Expressing
the argument in other words, the deviations in $r_k(\eta_t)$ and $D(\eta_t)$ from
the hypothetical values will depend solely on the differences between
the hypothetical and empirical correlograms of $\zeta_t$, and these
differences are exactly equal for all processes in the group $(G)$.

The forecasts based on the hypothesis of moving averages disclose
an interesting aspect of the test problem. Denoting by $F_t[\zeta(t+k)]$
the forecast over $k$ time units based on the development up to the
time point $t$, the general formula (213) gives in the approach (294)
the following forecasts

$$(306)\qquad F_t[\zeta(t+1)] = m + 1.0450\,\eta_t + 0.2730\,\eta_{t-1}; \qquad F_t[\zeta(t+2)] = m + 0.2730\,\eta_t;$$

and $F_t[\zeta(t+k)] = m$ for $k > 2$. In particular, taking $t = 1811$,
table 7 shows that $\eta_t = 31.8$, $\eta_{t-1} = 9.8$. Keeping in mind that
$m = -0.28$, we get $F_t[\zeta(t+1)] = 35.6$, $F_t[\zeta(t+2)] = 8.4$, while the
actual path of the index runs through $\zeta_{t+1} = 21$, $\zeta_{t+2} = -4$.
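The forecasts for $t = 1811$ can be reproduced with a few lines of Python. This is a sketch under stated assumptions: the helper `ma_forecast` is a hypothetical name, and the mean is taken as $m = -0.28$, the negative sign being an inference from the fact that it reproduces the quoted one-step forecast 35.6.

```python
def ma_forecast(b, m, eta_recent, k):
    """Forecast zeta(t+k) from a moving average with coefficients b[0..h].

    eta_recent[0] is eta_t, eta_recent[1] is eta_{t-1}, and so on.
    Implements m + b_k*eta_t + b_{k+1}*eta_{t-1} + ... + b_h*eta_{t-h+k}.
    """
    h = len(b) - 1
    return m + sum(b[j] * eta_recent[j - k] for j in range(k, h + 1))

b = [1.0, 1.0450, 0.2730]        # regular average (294)
m = -0.28                        # mean; sign assumed, see the lead-in
eta_recent = [31.8, 9.8]         # eta_1811 and eta_1810 from table 7

f1 = ma_forecast(b, m, eta_recent, k=1)
f2 = ma_forecast(b, m, eta_recent, k=2)
f3 = ma_forecast(b, m, eta_recent, k=3)
print(round(f1, 1), round(f2, 1), round(f3, 2))  # 35.6 8.4 -0.28
```

The observed values for 1812 and 1813 in table 7 are 21 and -4, so both forecasts overshoot in this instance.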

Considering on the other hand the approach (295), which belongs
to the same group as (294), and writing $\eta'_t$ for its primary series,
the forecast formulae corresponding to (306) would read
$F_t[\zeta(t+1)] = m + 1.0450\,\eta'_t + \eta'_{t-1}$, $F_t[\zeta(t+2)] = m + \eta'_t$.
However, in this case we cannot derive $\eta'_t$ and $\eta'_{t-1}$
from the observed values $\zeta_t, \zeta_{t-1}, \zeta_{t-2}, \dots$ Now, the autoregression

analysis as developed in Chapter II yields a linear forecast which 
is valid for any stationary process with finite dispersion, and for 
which the squared deviation from the future path is of minimum 
expectation. Writing (ft} for the residual process of the station- 

ary process considered, and paying regard to the results arrived 
at in section 26, this forecast will in the present case reduce to 

(307) Ft K (t + /ft] = h + 

where the sequence (b) is identical with the coefficients in the 
corresponding regular average. According to the general theory, 
{ 7 ^^^‘)(ft} is obtained simply by subjecting {C(ft} to the same linear 
operation which gives the primary process {'rjit)} in terms of the 
regular average. 

In taking the squared deviation as the measure of the efficiency
of a forecast, we conclude from the above that the different
hypothetical averages in a group $(G)$ will give rise to exactly the same
forecast series $F_t[\zeta(t+1)]$, $F_t[\zeta(t+2)]$, etc. A simple illustration
of this is given by the group (294)-(296). Taking e.g. the process
(295) for a hypothetical basis, we have in the first place to form
the residual series. According to the general analysis, this is
identical to col. (2) in table 7. Applying next the general relation
(307), and keeping in mind that the coefficients $(b)$ coincide with
those in the regular average (294), it follows that the resulting
forecasts will equal those previously obtained on the basis of the
regular average (294).

Expressing the situation in other words, we have found that the
different averages in a group $(G)$ are equivalent in view of those
aspects of the general test problem hitherto considered. Since the
indeterminateness is due to our having dealt with only the
autocorrelation properties of the hypothetical models, we have to use
other types of methods for distinguishing between the different
averages in a group. Generally speaking, these tests should examine
to what degree the elements $\eta_t$ and $\eta_{t+k}$ resulting from a
special hypothesis might be considered not only uncorrelated but
also independent. The different averages in a group giving rise to
different primary series $\eta_t$, we must therefore compare in detail the
multi-dimensional scatters $(\eta_t, \eta_{t-1}, \dots, \eta_{t-k})$. For the sake of
concreteness, the scatter $(\eta_t, \eta_{t-1})$ obtained from col. (2) in table 7 is
shown in fig. 9. To my eye, the scatter forms no very good
approximation to a distribution of independent variables. In performing
different tests of this nature, it may of course occur that no series
$\eta_t$ will give satisfactory results; perhaps one of the scatters
examined will suggest instead a non-linear scheme, e.g. a moving average
performed on the non-autocorrelated process defined by (286).
However, it would be beyond the frame of this study to enter
upon further details concerning the non-linear schemes and the
non-linear methods required for distinguishing between the different
moving averages belonging to the same group.


Fig. 9. Beveridge wheat price index. The scatter $(\eta_t, \eta_{t-1})$ belonging
to a hypothetical primary series.

As repeatedly emphasized in earlier sections, the applications
accounted for in the present study do not aim at quantitative results.
Accordingly, no attempts will be made to test the significance of the
parameters arrived at under the different hypotheses dealt with. Hence
it is out of the question to draw quantitative comparisons with earlier
investigations of the wheat price data analysed; as is known,
Sir W. Beveridge (1922) subjected his index to an extensive
periodogram analysis, while G. U. Yule (1926) has illustrated certain
new correlogram methods by the use of the Beveridge index numbers.
However, it will be illuminating to take up some points of these
investigations for a qualitative comparison with the moving average
approach.

In his presentation of the wheat price data, Sir W. Beveridge 
(1921) gives two series of index numbers, the trend present in the 
first series being removed in the second. In contradistinction to 


Beveridge and to the present writer, G. U. Yule (1926) works 
on the first series. In his search for hidden periodicities, Yule 
modifies the correlogram method because of the trend present in 
the original data. Since Yule’s approach takes into consideration 
the differences of the series analysed, it might be interpreted as 
an application of certain non-stationary or evolutive processes of 
the homogeneous type {^®} defined by (192). The investigation 
thus following a quite different line of research, no further comment 
is called for in the present connexion.

The periodogram method being based on the assumption of
hidden periodicities, its use will result in a hypothetical scheme
additively built up by a functional element $y(t)$ consisting of a
number of superposed harmonics, and a random element or "error"
$\eta(t)$. Since a single harmonic involves three parameters, the total
number of parameters in a scheme of hidden periodicities amounts
to thrice the number of harmonics superposed. Specifying the
values for these parameters, we can compute the functional element
$y(t)$. Deducting $y(t)$ from the original data, say $\zeta_t$, we obtain the
corresponding path of the random component, $\eta_t = \zeta_t - y(t)$.

A common standard measure of the efficiency of the periodogram
analysis is obtained by dividing the variance of the errors $\eta_t$ by
the variance of the original data $\zeta_t$. This quotient, say $\kappa^2$, is
always below unity, and the closer to zero the quotient, the more
the functional element $y(t)$ will "explain" of the series under
analysis. For instance, the harmonic corresponding to the largest
ordinate in the periodogram 1545-1844 given by Sir W. Beveridge
((1922), p. 438) will leave an "error" $\eta_t$ with variance amounting to
91 % of the variance of the wheat price index. The harmonic in
question is of period $p = 15.225$.

On the other hand, a scheme of moving averages (249) is built
up by means of a random variable $\eta(t)$, here called primary, and
a set of coefficients or parameters $(b)$. After having chosen numerical
values for the parameters $(b)$, the general analysis in the present
section provides a method for deriving from the original series $\zeta_t$
the corresponding path $\eta_t$ of the primary variable.

Even in the case of moving averages, the quotient $\kappa^2$ between
the variances of $\eta_t$ and $\zeta_t$ may be taken as a standard measure of
the efficiency of the analysis. The hypothetical value for $\kappa^2$ being
in this case $\kappa^2 = 1/(1 + b_1^2 + \dots + b_h^2)$, it is seen that the larger the
coefficients $(b)$, the less is $\kappa^2$, and the better the result of the
analysis. For instance, considering the $\eta_t$-values in table 7 derived
from the approach (294), we have, as already mentioned, an empirical
quotient of 0.557 against the hypothetical value 0.462. The small
values obtained seem rather satisfactory, but it must be remembered
that the analysis has been restricted to the last 100 Beveridge data
1770-1869. Thus, although the approach (294) involves only two
parameters, it might perhaps be unfair to the scheme of hidden
periodicities to compare directly with the $\kappa^2$-value 0.91 derived
from the whole series under the hypothesis of one harmonic component.

An examination of the forecasts delivered will show clearly the
thorough-going difference between the scheme of hidden periodicities
and the scheme of moving averages.

In a scheme of hidden periodicities of type (39), the forecast
curve is identical to the functional element $y(t)$, i.e. the sum of
harmonics superposed (cf. p. 59). The hypothetical model thus
will provide a definite functional forecast over the infinite future.
The expected squared deviation between the actual development
and the forecast is independent of the period forecasted over, and
amounts to $D^2(\eta)$.

A scheme of moving averages gives a quite different type of
forecast. In fact, considering a moving average (249) ranging over
$h + 1$ time units, it is only the forecasts over the next $h$ observations
that are effective; the forecasts beyond $h$ time units are
trivial, and reduce to the average of the original data. According
to the general analysis, the different averages in a group $(G)$ will
give rise to the same forecasts, viz.

$$F_t[\zeta(t+k)] = m + b_k \eta_t + b_{k+1} \eta_{t-1} + \dots + b_h \eta_{t-h+k},$$

where the coefficients $(b)$ are those appearing in the regular average
used as the basis for deriving the primary series $\eta_t$. The
hypothetical value for the squared deviation from the actual development
being (cf. (218))

$$(1 + b_1^2 + \dots + b_{k-1}^2) \cdot D^2(\eta),$$

it is seen that the efficiency of the prognosis will decrease gradually
as the period forecasted over is extended.
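How quickly the forecast loses its efficiency can be illustrated for the approach (294). The sketch below assumes the standard result for a regular moving average with $b_0 = 1$ on a purely random primary series, namely that the expected squared error over $k$ steps is $(1 + b_1^2 + \dots + b_{k-1}^2)\,D^2(\eta)$; the function name is hypothetical.

```python
def forecast_mse_ratio(b, k):
    """Expected squared forecast error over k steps, relative to var(zeta).

    Assumes a regular moving average with b[0] = 1 on a purely random
    primary series: error variance = (b_0^2 + ... + b_{k-1}^2) * var(eta),
    while var(zeta) = (b_0^2 + ... + b_h^2) * var(eta).
    """
    total = sum(c ** 2 for c in b)
    return sum(c ** 2 for c in b[:k]) / total

b = [1.0, 1.0450, 0.2730]   # regular average (294)
ratios = [round(forecast_mse_ratio(b, k), 3) for k in (1, 2, 3, 4)]
print(ratios)  # [0.462, 0.966, 1.0, 1.0]
```

Under these assumptions the one-step forecast removes rather more than half of the variance, the two-step forecast only a few per cent, and from three steps onward nothing is gained over the plain average.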

Especially in view of economic time series, the type of forecast 
delivered by the scheme of moving averages seems a priori more 
realistic, seems to correspond better to what might be reasonably 
possible to find out from the past development. Further, considering 
the forecasts over a short period, the prognosis given by the scheme 
of moving averages is, as a rule, rather efficient. In my opinion, 
this is a circumstance of central importance, for often the main 
interest is concentrated upon the prognosis concerning the near 
future. 

As to the harmonic components suggested by a periodogram
analysis, these cannot always be interpreted in the light of what is
otherwise known of the phenomena under investigation. Lacking
other evidence, the periodicities thus will stand out as quite isolated
results of the analysis. As pointed out by Sir W. Beveridge
((1922), p. 438), this is the case with the period of length 15.225
years suggested in his periodogram mentioned above. Against
this background, the scheme of moving averages seems more fertile.
Let us dwell a moment on this point.

In modern economic-statistical research work, a prominent line
of approach is the regression analysis of time series (cf. e.g. C. F.
Roos (1934)). Denoting a set of time series by $\zeta_t, \eta_t^{(1)}, \dots, \eta_t^{(n)}$,
a simple type of approach reads

$$(308)\qquad \zeta_t = c_1 \eta_t^{(1)} + c_2 \eta_t^{(2)} + \dots + c_n \eta_t^{(n)} + \varepsilon_t,$$

where $\varepsilon_t$ stands for the residual in the representation of $\zeta_t$ as
linearly correlated with the $n$ variables $\eta_t^{(i)}$. A generalized approach
is obtained by replacing $\eta_t^{(i)}$ by $\eta_{t-k_i}^{(i)}$, where the constant $k_i$
represents the lag of the series $\eta_t^{(i)}$ behind the series $\zeta_t$. The
concept of distributed lag implies a further generalization, which in
the simplest case $n = 1$ leads to an approach of type

$$(309)\qquad \zeta_t = b_0 \eta_t + b_1 \eta_{t-1} + \dots + b_h \eta_{t-h} + \varepsilon_t.$$

For instance, as an initial approximation we might represent a
wheat price (i.e. $\zeta_t$) in the year $t$ as linearly correlated with wheat
crops (i.e. $\eta$) in the years $t, t-1, \dots, t-h$. According to the
theory of supply and demand, we might expect that in this case
the dominant regression coefficients $b_i$ would be negative.

Disregarding the residual $\varepsilon_t$, and interpreting in the terminology
of the present study, the series $\zeta_t$ as given by (309) is seen to be
nothing else than a moving average performed on the primary
series $\eta_t$. Consequently, a hypothetical model corresponding to the
approach (309) will be obtained by adding two independent processes,
viz. a moving average $\{b_0 \eta(t) + b_1 \eta(t-1) + \dots + b_h \eta(t-h)\}$ and
a residual process $\{\varepsilon(t)\}$.
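A relation of the distributed-lag type (309) is usually fitted by least squares. The following Python sketch illustrates this on simulated data; the function name, the "true" coefficients and the noise level are all assumptions made for the illustration, and nothing here is taken from actual price or crop statistics.

```python
import numpy as np

def fit_distributed_lag(zeta, eta, h):
    """Least-squares estimate of b_0..b_h in zeta_t = sum_j b_j * eta_{t-j} + eps_t."""
    T = len(eta)
    # Column j holds eta_{t-j} for t = h, ..., T-1.
    X = np.column_stack([eta[h - j:T - j] for j in range(h + 1)])
    b_hat, *_ = np.linalg.lstsq(X, zeta[h:], rcond=None)
    return b_hat

# Simulated data (illustrative only): negative coefficients, as the
# supply-and-demand argument would suggest for prices versus crops.
rng = np.random.default_rng(1)
T, h = 500, 2
b_true = np.array([-0.8, -0.5, -0.2])
eta = rng.standard_normal(T)                      # stand-in for a crop series
eps = 0.1 * rng.standard_normal(T)                # residual
zeta = np.zeros(T)
for t in range(h, T):
    zeta[t] = sum(b_true[j] * eta[t - j] for j in range(h + 1)) + eps[t]

b_hat = fit_distributed_lag(zeta, eta, h)
```

With a residual this small the estimates in `b_hat` land close to `b_true`; with real data the residual process need not be purely random, which is one reason the text treats the two components as separate processes.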

Having seen that the concept of moving average may be attached
directly to the concept of distributed lag, the theory of moving
averages as developed in the present study seems to disclose new
aspects of lag problems, and to suggest methods for a deeper
analysis in this field of research. A few remarks on this line will
follow.

The idea underlying the general methods applied in the present
section is that the autocorrelation properties of a time series $\zeta_t$ will
reveal whether the process of moving averages is an adequate type
of hypothetical model for $\zeta_t$. The circumstances of importance in
this connexion are (A) that in such an analysis no other time series
are taken into consideration, and (B) that a specified hypothesis of
moving averages (it should be observed that the different averages
in a group $(G)$ must be examined separately!) will determine a
hypothetical primary series $\eta_t$. Accordingly, independently of the
first stage of the analysis we can compare such a hypothetical
series $\eta_t$ with other time series, say $\eta'_t$, $\eta''_t$, etc., thought of as
possibly affecting the series $\zeta_t$ examined. For instance, in applying
the approach (294) to the Beveridge wheat price index, we
have derived the hypothetical series $\eta_t$ given in col. (2) of table 7.
This series has been calculated merely in order to illustrate in
detail how the general inversion formula (200) works when applied
to an observational time series, but in case the parameters $(b)$ were
significant, the series $\eta_t$ might be compared with e.g. some
appropriate wheat crop series $\eta'_t$. Following the suggestion made in
connexion with the approach (309), we might in such a case change
simultaneously the signs in the coefficients $(b)$ and the series $\eta_t$.

In view of the lines of research suggested above, our theoretical
analysis of moving averages calls for generalizations in various
directions. Having assumed the primary process $\{\eta(t)\}$ in (249) to
be purely random, it would in the first place be of interest to
generalize the concept of moving average by removing the
restrictions imposed on the primary process $\{\eta(t)\}$. Now, assuming only
that $\{\eta(t)\}$ is stationary and of finite dispersion, the autocorrelation
coefficients of $\{\zeta(t)\}$ will exist, and be obtained by a straightforward
generalization of the relations (250). Considering two time
series $\zeta_t$ and $\eta'_t$ with correlograms $r_k(\zeta)$ and $r_k(\eta')$ respectively, the
generalized relations evidently may be used for finding out
approximately how a moving average with specified coefficients $(b)$ as
performed on $\eta'_t$ would transform $r_k(\eta')$. If the transformed correlogram
approximates $r_k(\zeta)$, we are led to investigate in detail whether $\zeta_t$
approximates a moving average with these coefficients $(b)$, and with
$\eta'_t$ for primary series. Concluding from theorem 9 that the
representations (200) and (303) hold even for the generalized average,
this investigation can be performed as suggested in the case of a
purely random $\{\eta(t)\}$, viz. by deriving directly from $\zeta_t$ the primary series
$\eta_t$ corresponding to the coefficients $(b)$, and then correlating or
comparing in another way the two series $\eta_t$ and $\eta'_t$.

Finally, another generalization is introduced when several series
$\eta'_t, \eta''_t, \dots$ are employed for building up the series $\zeta_t$ (cf. (308)).
Since such an approach falls under the theory of multi-dimensional
stochastical processes, a discussion of this case would be out of
place in the present study.

In examining the general methods applied to the Beveridge 
wheat price index, we have laid stress upon the different type of 
forecast delivered by the scheme of moving averages as compared 
with the scheme of hidden periodicities. The approach of moving 
averages seems particularly useful in forecasting over a short period 
of time, and attaches directly to current forecast methods, in par- 
ticular the approach of distributed lag. Having in the previous 
analysis referred throughout to economic-statistical applications,
the present section will be concluded by a few remarks concerning 
the applicability of the scheme of moving averages in other fields 
of scientific research. 

In periodogram analysis of geophysical data — e.g. records of 
rainfall, water-levels, temperature, terrestrial magnetism, etc. — 
it is often difficult to interpret the periods suggested as physical 
realities. The situation being the same as in economic applications, 
the claiming of such periodicities has been subjected to severe 
criticism. A fact especially stressed — see e.g. an excellent critical 
survey by D. Brunt (1937) — is that these periodicities can ex- 
plain but a small, often quite insignificant fraction of the variance 
in the observational data. In view of this, the lines of research 
based on multi-dimensional regression analysis seem more promising 
(see e.g. C. W. B. Normand (1932)). Aiming at short time forecasts, 
the realistic hypothesis underlying these investigations is that the 
phenomenon considered is causally connected with other phenomena 
by relations involving distributed lags. The theoretical set-up 
required having been touched upon in our discussion of economic 
applications, it is seen that the methods suggested by the theory of 
moving averages might be used also in these fields of research. 

For instance, representing the water-level in a lake by a moving
average of the rainfall in surrounding districts, and following the
method outlined, we are led to compare the water-level correlogram
with the rainfall correlogram as transformed by the hypothetical
moving average. By the courtesy of my friend B. Bruno, who
has taken an interest in my studies in time series analysis, I am
in a position to give an illustration of this on the basis of
correlograms appearing in a forthcoming paper (B. Bruno (1938)).

Starting from quarterly observations 1807-1936 of the level of
Lake Väner, 130 yearly data were obtained by simple averaging.
Using formula (13), Bruno has derived a set of serial coefficients
$r_k(\zeta)$ for each of the periods 1807-1936 and 1871-1930. The
resulting correlograms are shown in fig. 10.

If a strict periodicity were present in the material, the two
correlograms $r_k(\zeta)$ should rise simultaneously to a maximum with
abscissa equalling the length of the period. However, the
correlograms fluctuating rather independently, no period is suggested in
this way. Anyhow, in view of the smallness of the serial coefficients
$r_k(\zeta)$ obtained for $k > 1$, a hypothetical period cannot be expected to
explain a significant part of the variance in the series $\zeta_t$ under analysis.

Fig. 10. Correlogram of the level of Lake Väner 1807-1936 (thick line), and
1871-1930 (broken line). Transformed rainfall correlogram (small rings).

To my eye, the correlograms suggest more definitely a scheme of
moving averages of the simple type $\{\eta(t)\} + b\,\{\eta(t-1)\}$,
where $\{\eta(t)\}$ is purely random. In fact, only the coefficient $r_1$ is rather
large in both the correlograms, while, as should be the case if the
remaining serial coefficients were chance products, those based
on the longer period of observation are, on the whole, lying closer
to zero. Taking $r_1 = 0.4$ for the hypothetical value of $r_1(\zeta)$, we
have seen in illustration 2 in section 26 (see p. 131) that the
corresponding group $(G)$ of averages is constituted by the two
processes $\{\eta(t)\} + 0.5\,\{\eta(t-1)\}$ and $0.5\,\{\eta(t)\} + \{\eta(t-1)\}$.
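The two members of the group follow from the first serial coefficient alone. For a first-order average $\eta(t) + b\,\eta(t-1)$ on a purely random series one has $r_1 = b/(1 + b^2)$, a quadratic in $b$ whose two roots are reciprocal. The Python sketch below solves it; the function name is an illustrative assumption.

```python
import math

def ma1_group(r1):
    """Return the two first-order averages sharing the serial coefficient r1.

    Solves b / (1 + b**2) = r1 for b; requires 0 < |r1| <= 0.5.
    """
    disc = math.sqrt(1.0 - 4.0 * r1 ** 2)
    b_regular = (1.0 - disc) / (2.0 * r1)      # |b| <= 1: regular average
    b_nonregular = (1.0 + disc) / (2.0 * r1)   # reciprocal root: non-regular average
    return b_regular, b_nonregular

b_reg, b_non = ma1_group(0.4)
print(round(b_reg, 6), round(b_non, 6))  # 0.5 2.0
```

The root 2 corresponds to $\eta(t) + 2\,\eta(t-1)$, which after multiplication by 0.5 (so that both averages have the same variance) is the process $0.5\,\eta(t) + \eta(t-1)$ named in the text.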

B. Bruno has further constructed the correlogram of a series
$\eta'_t$ obtained by averaging the yearly rainfall 1867-1936 in four
cities in or near the drainage-basin of Lake Väner, viz. Falun,
Karlstad, Vänersborg, and Oslo. The correlogram obtained being
shown in fig. 11, it is seen that the deviations from zero of $r_k(\eta')$
are rather irregular and small.

Fig. 11. Correlogram of the rainfall 1867-1936 in the drainage basin of Lake
Väner (thick line), and the same correlogram transformed by formula (310) (small rings).

Assuming as a first approach the serial coefficients $r_k(\eta')$ to be
insignificant, we obtain a closed hypothetical model of the two
series $\zeta_t$ and $\eta'_t$ by regarding the rainfall series $\eta'_t$ as belonging to
a purely random process $\{\eta'(t)\}$, and the water-level series $\zeta_t$ as
belonging to one of the averages in the group constituted by
$\{\eta'(t)\} + 0.5\,\{\eta'(t-1)\}$ and $0.5\,\{\eta'(t)\} + \{\eta'(t-1)\}$.

As suggested on p. 170, this simple model might be generalized
by cancelling the assumption that $\{\eta'(t)\}$ be purely random. Letting
in such a case the autocorrelation coefficients of $\{\eta'(t)\}$ be represented
by $r_k(\eta')$, a short calculation shows that those of
$\{\zeta(t)\} = b_0\{\eta'(t)\} + b_1\{\eta'(t-1)\}$ will be given by

$$(310)\qquad r_k(\zeta) = \frac{(b_0^2 + b_1^2) \cdot r_k(\eta') + b_0 b_1\,[r_{k+1}(\eta') + r_{k-1}(\eta')]}{b_0^2 + b_1^2 + 2 b_0 b_1\, r_1(\eta')}$$

(cf. a related formula given by G. U. Yule (1926), Appendix II).
Replacing $r_k(\eta')$ by the serial coefficients $r_k(\eta'_t)$ of the observed
series, this formula gives with good approximation the correlogram of
the series $b_0 \eta'_t + b_1 \eta'_{t-1}$. Putting $b_0 = 1$ and $b_1 = 0.5$, we have
in this manner obtained the transformed correlogram indicated by small
rings in fig. 11.¹ For direct comparison with the water-level
correlograms, the coefficients $r_k(\eta'_t)$ as transformed by (310) have also
been plotted in fig. 10. The parallelism with the correlogram based on
the period 1871-1930 is rather encouraging: in 10 cases out of 13 the
rings and the coefficients $r_k(\zeta)$ vary in the same direction. Observing
that the transformation (310) is symmetrical in respect of the
coefficients $(b)$, we are thus led to examine in detail to what extent the
water-level may be approximated by a moving average of the rainfall
$\eta'_t$ of the simple form $\eta'_t + 0.5\,\eta'_{t-1}$ or $0.5\,\eta'_t + \eta'_{t-1}$.
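The transformation (310) is easily applied to any given correlogram. The Python sketch below writes it out for a two-term average; the function name and the test input (a purely random series, for which the answer is known in advance) are illustrative assumptions, and no rainfall figures are used.

```python
def transform_correlogram(r, b0, b1):
    """Apply relation (310) to a correlogram r[0..K] with r[0] = 1.

    Returns the autocorrelations r_k of b0*eta'(t) + b1*eta'(t-1)
    for k = 1, ..., K-1 (r_{k+1} is needed, so the last lag is lost).
    """
    denom = b0 ** 2 + b1 ** 2 + 2.0 * b0 * b1 * r[1]
    return [
        ((b0 ** 2 + b1 ** 2) * r[k] + b0 * b1 * (r[k + 1] + r[k - 1])) / denom
        for k in range(1, len(r) - 1)
    ]

# Purely random primary series: r_0 = 1, all other coefficients zero.
white = [1.0, 0.0, 0.0, 0.0]
out = transform_correlogram(white, b0=1.0, b1=0.5)
swapped = transform_correlogram(white, b0=0.5, b1=1.0)
```

For the purely random input the result is 0.4 at lag 1 and zero at lag 2, and interchanging the two coefficients leaves it unchanged, which is the symmetry that keeps the correlogram from distinguishing the two averages of the group.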


32. Some applications of the scheme of linear autoregression. 

Having in Chapter III investigated the process of linear autore- 
gression on the basis of a theory of stochastical difference equations, 
we shall in the present section give a few applications of this 
scheme. In doing this, we shall proceed as in the previous section. 
Choosing an economic time series as our experimental object, we 
shall first illustrate in detail a general method for determining the 
parameters when applying a scheme of linear autoregression. The 
modest purpose being to show how the method works, the tests 
touched upon in the following discussion will not be applied for 
examining the significance of the parameters arrived at. In discus- 
sing the results, the analysis will instead be focussed on a qualita- 
tive comparison with other hypothetical schemes, and with certain 
related lines of economic-statistical research work. Following up 
the parallelism with the applications of the scheme of moving 
averages, this section will be concluded by referring to a few other
fields of scientific research which invite an application of the scheme 
of linear autoregression. 

¹ Similarly, (310) gives the serial coefficients of the series extracted
in table 2 in terms of the coefficients $r_k$ given on p. 50.

The time series dealt with in the experiments accounted for in
the following is the Swedish cost of living index 1830-1913
compiled by G. Myrdal ((1933), Table A, budget b). Since the
index presents a marked trend, this had to be removed before
starting the analysis (cf. p. 146). Having used the 21-term formula
of J. Spencer for this purpose (see e.g. E. T. Whittaker and
G. Robinson (1926), p. 290 f), the deviations from the graduated
index are shown in fig. 12 (thick line). The graduation being
disturbed by a rapid rise of the index in the years 1851-57, the
values obtained for this period were subjected to a slight adjustment
(broken line). Another graduation by hand was used for the
years 1904-13 not covered by the Spencer formula (broken line).
In order to avoid decimals, the resulting 74 deviations 1840-1913
were multiplied by 10. These index fluctuations, say $\zeta_t$, constitute
our experimental series, and are shown in fig. 12. The numerical
values of $\zeta_t$ are given in col. (1) of table 8.
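The mechanical part of this trend removal can be sketched in Python. The weights below are those of Spencer's 21-term formula as commonly tabulated; the function name and the artificial linear index are illustrative assumptions, and the hand adjustments for 1851-57 and 1904-13 described above are not reproduced.

```python
import numpy as np

# Spencer's 21-term graduation weights (symmetric, summing to 350/350 = 1).
SPENCER_21 = np.array(
    [-1, -3, -5, -5, -2, 6, 18, 33, 47, 57, 60,
     57, 47, 33, 18, 6, -2, -5, -5, -3, -1], dtype=float) / 350.0

def spencer_deviations(index, scale=10.0):
    """Deviations of a yearly index from its Spencer 21-term graduation.

    The first and last 10 years are lost, since the formula needs
    10 neighbours on each side. Deviations are multiplied by `scale`.
    """
    index = np.asarray(index, dtype=float)
    trend = np.convolve(index, SPENCER_21, mode="valid")
    return scale * (index[10:-10] - trend)

# Illustrative check on made-up data: a pure linear trend leaves no deviations.
years = np.arange(1830, 1914)
linear_index = 100.0 + 0.8 * (years - 1830)
dev = spencer_deviations(linear_index)
```

Applied to an index for 1830-1913, the formula alone yields deviations for 1840-1903 only, which is why the last ten years need a separate graduation.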


Table 8. Fluctuations $\zeta_t$ in the G. Myrdal cost of living index (col. 1),
and hypothetical primary series $\hat\eta_t$ (col. 2).

Year    (1)     (2)  |  Year    (1)     (2)  |  Year    (1)     (2)  |  Year    (1)     (2)
1840      0          |  1860    -33    19.7  |  1880    -14    36.9  |  1900     34     5.5
  41     16          |    61     21    12.6  |    81     32     2.1  |    01     -5    -9.1
  42     31          |    62     58    20.4  |    82     24    -5.9  |    02     -9    18.8
  43     -6          |    63      8   -42.4  |    83     40    23.3  |    03     -2    11.2
  44    -56   -29.6  |    64    -35   -11.4  |    84     24     3.7  |    04    -30   -21.2
  45    -21    21.8  |    65    -39    -5.2  |    85     -3    14.5  |    05    -21     3.6
  46     16     6.9  |    66    -13    12.2  |    86    -35   -11.9  |    06    -14   -17.6
  47     35    16.9  |    67     43    36.5  |    87    -60   -23.6  |    07     29    31.9
  48      9   -18.0  |    68     71    26.6  |    88    -26     7.1  |    08     33    -2.2
  49    -13    -1.6  |    69     -1   -33.8  |    89     12    -1.9  |    09      8     0.6
1850    -36   -17.0  |  1870    -59   -16.7  |  1890     29     2.7  |  1910     -5     6.1
  51    -50   -19.0  |    71    -54    -3.0  |    91     56    30.4  |    11    -34   -17.8
  52    -58   -36.4  |    72    -36   -10.5  |    92     39     8.8  |    12     31    65.4
  53    -64   -49.0  |    73     25    26.8  |    93      2    12.2  |    13     24   -17.4
  54    -48   -38.6  |    74     48    -3.0  |    94    -42   -13.3  |   Sum   -198   -47.5
  55    -11   -21.1  |    75     30     1.6  |    95    -26    21.9  |
  56     68    38.5  |    76     32    30.3  |    96    -38   -32.4  |
  57     66    -3.7  |    77     31    30.2  |    97    -19     1.9  |
  58    -48   -62.8  |    78    -33   -22.6  |    98     11    -6.0  |
  59    -92   -19.3  |    79    -84   -30.0  |    99     39    17.9  |
Our series $\zeta_t$ is seen to reflect clearly the changes between
economic expansion and contraction. A certain regularity seems to
be present in the movement up and down, but the distance between
two adjacent maxima is rather inconstant, varying between some 5
and 10 years. The structure of the fluctuations is summed up in
the correlogram in fig. 14 obtained from formula (13). The serial
coefficients $\hat r_k$ are given in col. (1) of table 9.

The correlogram looks rather like a simple damped oscillation, say
$C \cdot q^k \cos(\lambda k + \varphi)$. An inspection of the graph shows that in
approximating the correlogram by such a function we would have
to take the period $p = 2\pi/\lambda$ to be about 7 or 8 years, the phase
$\varphi$ to be approximately vanishing, and the damping factor $q$ to
correspond to a damping of some 50 % in the duration of one
period.

According to the theoretical developments in section 25, a process
$\{\zeta(t)\}$ of linear autoregression as defined by a relation of type

(311)   $\zeta(t) + a_1\,\zeta(t-1) + a_2\,\zeta(t-2) = \eta(t)$

will present a correlogram forming a simple damped harmonic. On
the other hand, in a scheme of hidden periodicities, each of the
harmonic components will give rise to an undamped harmonic in
the correlogram. We conclude that such a correlogram cannot
approximate the graph of serial coefficients unless at least two
harmonics are superposed. However, such a scheme would involve
6 or more parameters, while only two are required in (311). In
seeking for a simple hypothetical scheme with correlogram
approximating $\hat r_k$ as shown in fig. 14, we are thus led to try first a
process of linear autoregression, and firstly one of the simple type
defined by (311).

In a general process (219) of linear autoregression, the linear
system (222) with coefficients $a_1, \dots, a_h$ will deliver the
autocorrelation coefficients $r_1, \dots, r_{h-1}$ required for deriving the
following coefficients $r_h$, $r_{h+1}$, etc. from the difference relations
(221). In searching for an adequate scheme (311), we are confronted
with the inverse problem, viz. to find a set of coefficients
$a_1, \dots, a_h$ giving rise to a hypothetical process (311) with
correlogram approximating a prescribed, empirical correlogram. Now,
observing that the relations (221)–(223) are linear also in respect of
the coefficients (a), a convenient starting point for attacking the
problem before us would be to replace the coefficients $r_i$ in (222)
and the last relation (221), a system whose determinant $\Delta$ is
$\neq 0$, by the prescribed values $\hat r_i$, and to solve instead for the
coefficients (a). The system yielding our trial set (a) thus will read

(312)   $\hat r_1 + a_1 + a_2 \hat r_1 + a_3 \hat r_2 + \dots + a_h \hat r_{h-1} = 0$
        $\hat r_2 + a_1 \hat r_1 + a_2 + a_3 \hat r_1 + \dots + a_h \hat r_{h-2} = 0$
        $\dots$
        $\hat r_h + a_1 \hat r_{h-1} + a_2 \hat r_{h-2} + a_3 \hat r_{h-3} + \dots + a_h = 0.$

If the roots of the equation (34) formed by the resulting coefficients
(a) are lying in the unit circle, these coefficients will define a
process (219) of linear autoregression. By construction, the
autocorrelation coefficients $r_1, \dots, r_h$ of this process will coincide
with the prescribed values. The following coefficients $r_{h+1}$, $r_{h+2}$,
etc. will be obtained from the difference relations (221) formed by
the hypothetical coefficients (a).
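
As an illustration only (it is not part of the original text), the trial system (312) can be solved mechanically. The Python sketch below assumes the serial coefficients are supplied as a plain list $\hat r_1, \dots, \hat r_h$; the function names `solve_linear` and `trial_coefficients` are hypothetical, and ordinary Gaussian elimination stands in for whatever hand computation was actually used.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [list(map(float, row)) + [float(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        # Bring the largest remaining entry of this column to the diagonal.
        pivot = max(range(col, n), key=lambda row: abs(M[row][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for row in range(col + 1, n):
            factor = M[row][col] / M[col][col]
            for c in range(col, n + 1):
                M[row][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def trial_coefficients(r_hat):
    """Return [a_1, ..., a_h] from system (312), given [r_1, ..., r_h].

    Row i of (312) reads  r_i + sum_j a_j * r_|i-j| = 0,  with r_0 = 1.
    """
    h = len(r_hat)
    rho = [1.0] + list(r_hat)
    A = [[rho[abs(i - j)] for j in range(1, h + 1)] for i in range(1, h + 1)]
    b = [-rho[i] for i in range(1, h + 1)]
    return solve_linear(A, b)


if __name__ == "__main__":
    # First two serial coefficients of the index, as quoted in the text.
    print(trial_coefficients([0.5216, -0.2240]))
```

With the two serial coefficients 0.5216 and -0.2240 the sketch returns values that agree, to the accuracy printed, with the coefficients of the approach (315).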

Having derived the correlogram $r_k$ of the hypothetical process,
we are in a position to compare it with the empirical correlogram $\hat r_k$.
If the fit seems satisfactory, we may carry the analysis further on
the basis of the coefficients (a) found, but if the deviations seem
too large, an adjustment in the coefficients (a) is called for.

In analogy with the case of moving averages, we obtain
according to (219) a primary series

(313)   $\hat\eta_t = (\zeta_t - m) + a_1(\zeta_{t-1} - m) + \dots + a_h(\zeta_{t-h} - m)$

corresponding to our set (a). In other words, the complete
hypothetical model will consist of h parameters (a) and the primary
series $\hat\eta_t$; using these quantities, we can reconstruct the original
series $\zeta_t$. In this case, too, we may look upon the quotient
between the dispersions of $\hat\eta_t$ and $\zeta_t$ as a measure of the
efficiency of the analysis: the closer to zero our quotient, the less
important is the »unexplained» random component, and the greater
the efficiency of the approach.
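
The computation in (313) can be sketched in a few lines of Python. This is a hedged illustration, not the author's procedure: the function name `primary_series` is hypothetical, the mean m is taken as the column sum of Table 8 divided by the 74 observations, and the coefficients used in the demonstration are those of the approach (319) quoted later in this section.

```python
def primary_series(series, a, m):
    """Residuals from (313): eta_t = sum_j c_j * (zeta_{t-j} - m), c_0 = 1, c_j = a_j.

    The first h elements of the series yield no residual, because they
    lack the h preceding values.
    """
    h = len(a)
    coeffs = [1.0] + list(a)
    return [
        sum(coeffs[j] * (series[t - j] - m) for j in range(h + 1))
        for t in range(h, len(series))
    ]


if __name__ == "__main__":
    a_319 = [-0.7776, 0.6797, -0.1342, 0.2914]   # coefficients of approach (319)
    m = -198 / 74                                 # mean of col. (1), Table 8
    zeta_1906_1910 = [-14, 29, 33, 8, -5]         # col. (1), Table 8
    # One residual, for 1910; compare with col. (2) of Table 8.
    print(primary_series(zeta_1906_1910, a_319, m))
```

The single value printed should agree, after rounding to one decimal, with the entry for 1910 in col. (2) of Table 8.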

As mentioned in section 29, the method of G. U. Yule (1927)
for determining the coefficients (a) in the approach (313) is to
minimize the sum of squares $\sum \hat\eta_t^2$. The corresponding sum for
the series $\zeta_t$ being constant, this method will make the above
quotient a minimum, and may thus be said to be of maximum efficiency.
Having stated this, let it be pointed out explicitly that the Yule
method gives a system for determining the coefficients (a), which
is equivalent to the system (312) started from above. In fact,
minimizing $\sum \hat\eta_t^2$ as defined by (313), we obtain a set of normal
equations, where, apart from a constant factor, the coefficients of the
$a_i$'s will obviously approximate the corresponding serial coefficients
in the system (312).

The variance of the process $\{\eta(t)\}$ defined by our hypothetical set
(a) will be given by formula (220). Of course, this relation holds
irrespective of the method used in determining the set (a). On the
other hand, choosing the system (312) for determining the
coefficients (a), the resulting primary series $\hat\eta_t$ will evidently satisfy
the parallel relation

(314)   $D^2(\hat\eta_t) \approx (1 + a_1 \hat r_1 + a_2 \hat r_2 + \dots + a_h \hat r_h) \cdot D^2(\zeta_t)$,

where the sign $\approx$ covers the approximation made in disregarding
the first h terms in the series $\zeta_t$, for which we cannot calculate
corresponding elements $\hat\eta_t$. In other words, the variance of the
primary series $\hat\eta_t$ will approximate the hypothetical value $D^2(\eta)$.
It must be kept in mind that this will not always be the case if the
trial set (a) is determined otherwise than by (312) (cf. p. 145).

Summing up, the system (312) will give us a set $a_1, \dots, a_h$ which
will minimize the variance of the corresponding residuals $\hat\eta_t$.
Further, the first h autocorrelation coefficients will coincide with the
corresponding serial coefficients. However, the hypothetical
correlogram will not always in its whole range yield a good fit to the
empirical correlogram. In practice, we must compromise between
the two desiderata of obtaining small residuals and small
deviations between the correlograms, and besides try to satisfy the
relation $D^2(\hat\eta_t) \approx D^2(\eta)$. Before discussing this matter, let us see
in detail how the method outlined will function when applied to the
G. Myrdal cost of living index.

Forming the system (312) for $h = 2$, inserting the values $\hat r_1 = 0.5216$,
$\hat r_2 = -0.2240$ given in col. (1) of table 9, and solving for the
coefficients $a_1$ and $a_2$, we obtain $a_1 = -0.8771$, $a_2 = 0.6815$. The roots
of the characteristic equation $z^2 + a_1 z + a_2 = 0$ being $0.4385 \pm 0.6994\,i$
and thus less than unity in modulus, we conclude that the relation

(315)   $\zeta(t) - 0.8771\,\zeta(t-1) + 0.6815\,\zeta(t-2) = \eta(t)$

will define a process of linear autoregression. By construction, the
first two autocorrelation coefficients of this process $\{\zeta(t)\}$ read
$r_1 = \hat r_1 = 0.5216$, $r_2 = \hat r_2 = -0.2240$, while the following coefficients
will be obtained recurrently from the difference relation
$r_k - 0.8771\,r_{k-1} + 0.6815\,r_{k-2} = 0$. The resulting correlogram, which is
evidently of the form (243), is shown in fig. 14 (thin line).
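
The recurrent derivation of the hypothetical correlogram lends itself to a short sketch. The Python below is illustrative only; the function name `hypothetical_correlogram` is hypothetical, and the difference relation is assumed to read $r_k + a_1 r_{k-1} + \dots + a_h r_{k-h} = 0$ for $k > h$, which is the form consistent with the numerical relation just quoted.

```python
def hypothetical_correlogram(a, r_start, n):
    """Extend r_1..r_h to r_1..r_n by r_k = -(a_1 r_{k-1} + ... + a_h r_{k-h})."""
    h = len(a)
    if len(r_start) != h:
        raise ValueError("need exactly h starting coefficients r_1..r_h")
    r = [1.0] + list(r_start)          # r[0] = r_0 = 1
    for k in range(h + 1, n + 1):
        r.append(-sum(a[j] * r[k - 1 - j] for j in range(h)))
    return r[1:n + 1]


if __name__ == "__main__":
    # Approach (315): two coefficients, two starting values.
    for k, r_k in enumerate(
        hypothetical_correlogram([-0.8771, 0.6815], [0.5216, -0.2240], 10), start=1
    ):
        print(k, round(r_k, 4))
```

The same function applies unchanged to schemes with more parameters; only the lists `a` and `r_start` grow.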
Table 9. Serial coefficients $\hat r_k$ of the G. Myrdal cost of living index
[col. (1)]¹, and autocorrelation coefficients $r_k$ belonging to the schemes
(318) [col. (2)], and (319) [col. (3)].

 k      (1)       (2)       (3)    |   k      (1)       (2)       (3)
 1    0.5216    0.5216    0.5386   |  11   -0.1633   -0.1722   -0.2170
 2   -0.2240   -0.2240   -0.1460   |  12   -0.2630   -0.0218   -0.1536
 3   -0.5811   -0.5811   -0.5024   |  13   -0.2254    0.1065   -0.0119
 4   -0.4626   -0.4626   -0.5105   |  14   -0.0042    0.1311    0.1042
 5   -0.0963   -0.0734   -0.2320   |  15    0.1883    0.0609    0.1318
 6    0.2036    0.2749    0.1417   |  16    0.1601   -0.0333    0.0747
 7    0.3138    0.3538    0.3458   |  17    0.0723   -0.0818   -0.0140
 8    0.2613    0.1717    0.2902   |  18   -0.0136   -0.0630   -0.0743
 9    0.1434   -0.0820    0.0772   |  19   -0.0067   -0.0062   -0.0766
10   -0.0034   -0.2172   -0.1321   |  20    0.0342    0.0408   -0.0327


Comparing with the empirical correlogram, it is seen that the
period in the hypothetical correlogram is too short, and that the
damping is a little too heavy. According to section 6, the damping
factor equals $\sqrt{a_2}$, while the period is given by $p = 2\pi/\lambda$, where
$\cos\lambda = -a_1 / (2\sqrt{a_2})$. Thus, an increase in $a_2$ will bring on a slighter
damping. Further, reducing $\lambda$ we obtain a longer period. However,
as pointed out in the previous discussion, we cannot conclude
without further evidence that it will be possible to improve the fit:
the coefficients $a_1$ and $a_2$ determine also the constant factor and
the phase of the damped harmonic, and it might happen that an
adjustment in $a_1$ and $a_2$ would cause such a change, e. g. in the phase,
that the total result of the adjustment would be a poorer fit.
Proceeding with the illustration, we shall next examine the total effect
of an adjustment.

In the correlogram of the approach (315), the period is found to
be 6.22 years, while a good fit would require a period not below
7 years. Reducing the damping by increasing $a_2$ from 0.6815 to
0.77, a short calculation will verify that $a_1 = -1.10$ will give a period
$p = 7.03$ years. Now, let us examine the approach

(316)   $\zeta(t) - 1.10\,\zeta(t-1) + 0.77\,\zeta(t-2) = \eta(t)$.

¹ In reading the proofs, a slight error was discovered in the serial coefficients:
they should all have been multiplied by a factor amounting to about 1.005.

The correlogram of the process thus defined has been
calculated from (221) and (222), and is shown in fig. 14 (broken line).
Up to $r_8$ and $r_9$, the hypothetical correlogram seems to fit rather
well. Beyond this point, the fit is less satisfactory, partly because
the graph of serial coefficients presents a slow descent to the
minimum in $k = 12$, and a rapid rise to the next maximum. This
skewness will be recurred to later.
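
The "short calculation" connecting a pair of coefficients with the damping factor and the period can be sketched as follows. This is an illustration under the relations stated above, namely damping factor $\sqrt{a_2}$ and $\cos\lambda = -a_1/(2\sqrt{a_2})$; the function name `damped_harmonic` is hypothetical.

```python
import math


def damped_harmonic(a1, a2):
    """Return (damping factor, period) for zeta(t) + a1 zeta(t-1) + a2 zeta(t-2) = eta(t).

    Valid only when the characteristic roots are complex, i.e. a1**2 < 4 * a2.
    """
    if a2 <= 0 or a1 * a1 >= 4.0 * a2:
        raise ValueError("characteristic roots are not complex conjugates")
    q = math.sqrt(a2)                      # modulus of the conjugate roots
    lam = math.acos(-a1 / (2.0 * q))       # angle of the roots
    return q, 2.0 * math.pi / lam


if __name__ == "__main__":
    print(damped_harmonic(-0.8771, 0.6815))   # approach (315)
    print(damped_harmonic(-1.10, 0.77))       # approach (316)
```

Run on the two coefficient pairs of the text, the periods come out close to the 6.22 and 7.03 years quoted.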

A clear view of the adjustment will be obtained by calculating
the roots of the characteristic equations of (315) and (316). In
fig. 13, these roots are indicated by small rings. In drawing the
conjugate roots nearer to the periphery of the unit circle, the
damping has been reduced, while the reduction in the angle $\lambda$ has
elongated the period.

Fig. 13. Adjustment paths in the approaches (315) (rings), and (318) (crosses).

Each of the approaches (315) and (316) gives rise to a primary
series $\hat\eta_t$, and we know from a previous remark of general scope
that the improvement in the fit of the hypothetical correlogram is
obtained at the expense of an increase in the variance of the series
$\hat\eta_t$. In addition, the approach (315) being based on a system of
type (312), the relation (314) will hold.

Thus, paying regard to (220), and inserting $a_1 = -0.8771$, $a_2 = 0.6815$,
$r_1 = 0.5216$, $r_2 = -0.2240$, we obtain in this case
$D^2(\eta) = D^2(\hat\eta_t) \approx 0.390\,D^2(\zeta)$. As to the residuals $\hat\eta_t$ derived
from the approach (316), the development

(317)   $D^2(\hat\eta_t) = D^2(\zeta_t + a_1 \zeta_{t-1} + a_2 \zeta_{t-2})$
        $\approx (1 + a_1^2 + a_2^2 + 2 a_1 \hat r_1 + 2 a_2 \hat r_2 + 2 a_1 a_2 \hat r_1) \cdot D^2(\zeta)$

will not reduce to (314). Inserting $a_1 = -1.10$, $a_2 = 0.77$, $\hat r_1 = 0.5216$,
$\hat r_2 = -0.2240$, we find in this case $D^2(\hat\eta_t) \approx 0.427\,D^2(\zeta)$. In full
agreement with the general theory, the adjustment in the coefficients (a)
has reduced the efficiency of the approach. As mentioned in
connexion with (314), the hypothetical variance $D^2(\eta)$ will also be
affected by the adjustment. Generally speaking, nothing compels
$D^2(\eta)$ to follow the variation in $D^2(\hat\eta_t)$. Actually, in the present
case $D^2(\eta)$ varies contrarily to $D^2(\hat\eta_t)$: the first two
autocorrelation coefficients in the scheme (316) being $r_1 = 0.6215$, $r_2 = -0.0864$
(cf. fig. 14), it is readily verified by inserting these values together
with $a_1 = -1.10$, $a_2 = 0.77$ in formula (220) that the variance in
question will be given by $D^2(\eta) = 0.250\,D^2(\zeta)$.
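
The three variance factors quoted above can be reproduced numerically. The Python sketch below is illustrative; it assumes that formula (220), which is not reproduced in this section, has the same form as (314) with hypothetical coefficients $r_k$ in place of serial ones, an assumption that is at least consistent with the figures in the text. The function names are hypothetical.

```python
def hypothetical_variance_factor(a, r):
    """Factor 1 + a_1 r_1 + ... + a_h r_h, the form assumed here for (220) and (314)."""
    return 1.0 + sum(a_i * r_i for a_i, r_i in zip(a, r))


def residual_variance_factor(a, r_emp):
    """Factor in the development (317), written for any h.

    Variance of zeta_t + a_1 zeta_{t-1} + ... + a_h zeta_{t-h}, in units of
    D^2(zeta), using the serial coefficients r_emp = [r_1, ..., r_h].
    """
    coeffs = [1.0] + list(a)
    rho = [1.0] + list(r_emp)
    size = len(coeffs)
    return sum(
        coeffs[i] * coeffs[j] * rho[abs(i - j)]
        for i in range(size)
        for j in range(size)
    )


if __name__ == "__main__":
    r_emp = [0.5216, -0.2240]
    print(hypothetical_variance_factor([-0.8771, 0.6815], r_emp))      # approach (315)
    print(residual_variance_factor([-1.10, 0.77], r_emp))              # residuals of (316)
    print(hypothetical_variance_factor([-1.10, 0.77], [0.6215, -0.0864]))  # scheme (316)
```

For coefficients obtained from the system (312) the two functions agree, which is the content of the remark that (317) then reduces to (314).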



Fig. 14. Correlogram of the G. Myrdal cost of living index (thick line), and
hypothetical correlogram corresponding to formula (315) (thin line), and formula
(316) (broken line).


Summing up the comparison between the approaches (315) and
(316), the hypothetical correlogram fits better in the latter case,
but the variance $D^2(\hat\eta_t)$ is smaller in the former case and coincides
with the hypothetical variance $D^2(\eta)$; in the approach (316) the
difference between these variances is rather large. All in all,
neither of the schemes seems adequate. In view of other
experiments with approaches of the simple type (311), it seems as if we
cannot find a satisfactory approach without taking into account
more distant elements $\zeta_{t-3}$, $\zeta_{t-4}$, etc.

Of course, having different desiderata to comply with in applying
a scheme of linear autoregression, we could agree on which weights
to attach to them, and then take them into consideration
simultaneously. In point of principle, it would then be possible to find
out a set (a) forming the best compromise in the sense agreed.
However, judging from certain experiments of this kind, what might
be gained in this way seems not worth the extensive computations
involved. One way or another, the results arrived at in these
experiments merit no recital. Accordingly, proceeding to an account
of certain experiments with a scheme (219) involving four
parameters (a), we shall follow the same line as before.

Taking $h = 4$, and inserting in the system (312) the serial
coefficients $\hat r_k$ given in table 9, we arrive at the following approach

(318)   $\zeta(t) - 0.8100\,\zeta(t-1) + 0.7452\,\zeta(t-2) - 0.0987\,\zeta(t-3) + 0.2101\,\zeta(t-4) = \eta(t)$.

Fig. 15. Correlogram of the G. Myrdal cost of living index (thick line), and
hypothetical correlogram corresponding to formula (318) (thin line), and formula
(319) (broken line).

The autocorrelation coefficients of the process $\{\zeta(t)\}$ thus defined
have been calculated from (221) and (222). The values found are
given in col. (2) of table 9, and plotted in fig. 15 (thin line).

It is seen that the approaches (316) and (318) give rise to almost
coincident correlograms, only that here we have by construction
$r_k = \hat r_k$ for $k \le 4$. As shown in fig. 13, this conformity is reflected
in the characteristic equations: in the present case two of the
four roots are lying in the close neighbourhood of the roots
belonging to the approach (316). The numerical values found for
the roots in the approach (318) read $0.5385 \pm 0.6814\,i$, $-0.1335 \pm 0.5106\,i$.
In adjusting the approach (318), we are at liberty to move the
roots of the characteristic equation in any directions, keeping
in mind that complex roots must be conjugate. The dominant
component in the correlogram evidently corresponds to the roots
$0.5385 \pm 0.6814\,i$, say $q\,e^{\pm i\lambda}$. The period of this component is nearly
7 years, while the empirical correlogram suggests a somewhat longer
period. Now, reducing the angle $\lambda$ so as to obtain a period
equalling 7.5 years, the resulting set of coefficients (a) gave rise to a
correlogram with reduced amplitudes. Neutralizing this effect by
reducing the damping by means of a slight move towards the
periphery of the unit circle, it was found adequate to perform a
simultaneous move in the other two roots. As a matter of fact,
the deviations between $r_k$ and $\hat r_k$ for $k = 1, \dots, 4$ caused by the
adjustments mentioned were found to be substantially reduced by
moving the root $-0.1335 + 0.5106\,i$ in a direction nearly opposite
to the adjustment in the root $0.5385 + 0.6814\,i$.

Having thus arrived at the roots $0.5888 \pm 0.6540\,i$, $-0.20 \pm 0.58\,i$,
the paths followed are indicated in fig. 13. As is readily verified,
the adjusted roots belong to the approach

(319)   $\zeta(t) - 0.7776\,\zeta(t-1) + 0.6797\,\zeta(t-2) - 0.1342\,\zeta(t-3) + 0.2914\,\zeta(t-4) = \eta(t)$.
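
That a given set of roots belongs to a given approach can be checked by multiplying out the linear factors of the characteristic equation. The Python sketch below is an illustration only; the function name `coefficients_from_roots` is hypothetical. Because the second pair of roots is quoted to two decimals only, agreement with the printed coefficients can be expected to about the third decimal, not better.

```python
def coefficients_from_roots(roots):
    """Coefficients [1, a_1, ..., a_h] of the monic polynomial with the given roots."""
    coeffs = [1 + 0j]
    for z in roots:
        # Multiply the current polynomial by (x - z).
        extended = coeffs + [0j]
        for i, c in enumerate(coeffs):
            extended[i + 1] -= z * c
        coeffs = extended
    # Conjugate pairs make the imaginary parts cancel up to rounding.
    return [c.real for c in coeffs]


if __name__ == "__main__":
    adjusted_roots = [
        complex(0.5888, 0.6540), complex(0.5888, -0.6540),
        complex(-0.20, 0.58), complex(-0.20, -0.58),
    ]
    print([round(c, 4) for c in coefficients_from_roots(adjusted_roots)])
```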

The autocorrelation coefficients of this scheme as derived from the
system (222) and the relations (221) are given in col. (3) of table 9,
and plotted in fig. 15 (broken line). The fit to the empirical
correlogram is not very close, but the general shape of the hypothetical
correlogram is rather satisfactory. Comparing with the approach
(316), which involves only two parameters, it is seen that the
improvement bears chiefly upon the period and the coefficient $r_1$.

Using formula (314), and paying regard to the identities $r_k = \hat r_k$,
$k \le 4$, the approach (318) gives $D^2(\eta) = 0.371\,D^2(\zeta)$,
$D^2(\hat\eta_t) \approx 0.371\,D^2(\zeta)$.
Comparing with the simple scheme (315), it is seen that the
reduction of the factor in $D^2(\eta)$ and $D^2(\hat\eta_t)$ amounts to only 0.019. In
other words, the introduction of two more parameters has brought
on but a slight increase in the efficiency of the approach. However,
comparing the adjusted schemes, we find a definite improvement.
Proceeding as in (317), we find in the first place that the residuals
derived from (319) will satisfy the relation $D^2(\hat\eta_t) \approx 0.381\,D^2(\zeta)$
(working directly on the series $\zeta_t$ and $\hat\eta_t$ given in table 8, we find
$D^2(\hat\eta_t) = 0.385\,D^2(\zeta_t)$). The adjustment has thus reduced the
efficiency of the approach but very slightly. On the other hand,
applying formula (220), we find that the parallel hypothetical relation
reads $D^2(\eta) = 0.401\,D^2(\zeta)$. Contrary to the situation in the approach
(316), it is seen that $D^2(\hat\eta_t)$ in the present case approximates $D^2(\eta)$
even after the adjustment.

Perhaps it would be possible to find an adjustment improving
the approach (319). However, in view of the above figures, not
much can be gained by a continued adjusting. Nor does it seem
as if a real improvement could be secured by enlarging our set (a).
Be that as it may, the above examples are sufficient for our purpose
of illustrating a general method for deriving a trial set of
coefficients (a) in applying a process of linear autoregression, and for
performing adjustments in the trial set found.

Fig. 16. G. Myrdal cost of living index $\zeta_t$. The scatters $(\zeta_t, \zeta_{t-1})$ (left), and
$(\zeta_t, \zeta_{t-3})$ (right).

Proceeding to a discussion of the approach (319), the same general 
viewpoints present themselves as in the case of moving averages. 
Accordingly, referring to the remarks in the previous section (cf. 
p. 163), we need draw attention only to a few circumstances which 
are peculiar to the scheme of linear autoregression. 

As indicated by the term proposed, the hypothesis of linear
autoregression implies a linear regression of $\zeta_t$ upon each of the
preceding elements $\zeta_{t-1}$, $\zeta_{t-2}$, etc. This circumstance having already
been employed by G. U. Yule (1927) for testing an approach of
this kind (cf. section 29, p. 141), we show in fig. 16 the scatters
$(\zeta_t, \zeta_{t-1})$ and $(\zeta_t, \zeta_{t-3})$ of the fluctuations in the Myrdal index.
The deviations from linearity in the connexion between the
variables do not seem disturbing.

As in the case of moving averages, different tests of a scheme
of linear autoregression may be based on the hypothetical
randomness of the primary series $\hat\eta_t$. Such tests may, for example, be
focussed upon the scatters formed by pairs of residuals. A scatter
formed by the residuals $\hat\eta_t$ given in table 8 is shown in fig. 17.

The above mentioned skewness in the correlogram of the G.
Myrdal index (see fig. 14) gives an interesting illustration of the
difficulties of testing time series schemes. Is it permitted to look
upon the deviation from the hypothetical correlogram as produced by
pure chance? An examination of this question must pay due regard
to the interdependence of the serial coefficients in a sample series.
For instance, a chance deviation in $k = 13$ would perhaps be most
often attended by such deviations of neighbouring coefficients that
the total picture would present a skew oscillation. The question of
how much weight to attach to deviations of this and similar kinds
seems extremely intricate. Perhaps nothing better can be done than
to compute a large number of model series correlograms, and then to
compare the deviations from the hypothetical curve.

Fig. 17. Scatter of a hypothetical primary series $\hat\eta_t$ of the G. Myrdal
cost of living index.
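
The suggestion of computing a large number of model series correlograms can be sketched with a small simulation. The Python below is a hedged illustration and makes several assumptions that are not in the text: the random shocks are taken as normal with unit variance, a burn-in stretch is discarded to approximate stationarity, and the serial coefficient of lag k is taken as the lag-k sum of products of deviations from the mean divided by the sum of squared deviations, which may differ in detail from formula (13). All function names are hypothetical.

```python
import random


def simulate_autoregression(a, n, seed=None, burn_in=200):
    """Model series of length n from x_t + a_1 x_{t-1} + ... + a_h x_{t-h} = shock_t."""
    rng = random.Random(seed)
    h = len(a)
    x = [0.0] * h
    for _ in range(burn_in + n):
        shock = rng.gauss(0.0, 1.0)        # assumed normal shocks
        x.append(shock - sum(a[j] * x[-1 - j] for j in range(h)))
    return x[-n:]


def serial_coefficients(x, max_lag):
    """Serial coefficients r_1..r_max_lag (assumed definition, see lead-in)."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    denom = sum(d * d for d in dev)
    return [
        sum(dev[t] * dev[t + k] for t in range(n - k)) / denom
        for k in range(1, max_lag + 1)
    ]


def model_correlograms(a, n, max_lag, replications, seed=0):
    """One correlogram per simulated model series."""
    return [
        serial_coefficients(simulate_autoregression(a, n, seed + i), max_lag)
        for i in range(replications)
    ]


if __name__ == "__main__":
    a_319 = [-0.7776, 0.6797, -0.1342, 0.2914]
    runs = model_correlograms(a_319, n=74, max_lag=20, replications=200)
    # Spread of the simulated coefficient of lag 13 across the model series.
    lag13 = sorted(run[12] for run in runs)
    print(lag13[0], lag13[len(lag13) // 2], lag13[-1])
```

Comparing the spread of the simulated coefficients with the deviations in fig. 14 gives a rough, purely empirical impression of what chance alone may produce in a series of 74 terms.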

As pointed out in section 24, an approach of linear autoregression
will give rise to a forecast curve which forms a damped oscillation
of type (33), and satisfies the same linear difference equation as the
autocorrelation coefficients. Letting as before $F_t[\zeta(t+k)]$ represent
a forecast over k time units, formula (214) shows that $F_t[\zeta(t+k)]$
may be conveniently computed recurrently. For instance,
considering the approach (319), we get

        $F_t[\zeta(t+1)] = (1 + a_1 + a_2 + a_3 + a_4) \cdot m - a_1 \zeta_t - a_2 \zeta_{t-1} - a_3 \zeta_{t-2} - a_4 \zeta_{t-3}$,
        $F_t[\zeta(t+2)] = (1 + a_1 + a_2 + a_3 + a_4) \cdot m - a_1 \cdot F_t[\zeta(t+1)] - a_2 \zeta_t - a_3 \zeta_{t-1} - a_4 \zeta_{t-2}$,

etc. In particular, inserting for $t = 1912$ the values of $\zeta_t$ given in col. (1)
of table 8, we find $F_{1912}[\zeta(1913)] = 41.4$, $F_{1912}[\zeta(1914)] = 5.2$, etc.
The forecast curve $F_{1912}[\zeta(1912 + k)]$ is shown in fig. 12 (dashes
and dots). A check of the first forecast is obtained by adding
the hypothetical residual (»random shock») $\hat\eta_{1913} = -17.4$; the result
is 24, which equals $\zeta_{1913}$. Fig. 12 shows also the forecast curve
$F_{1913}[\zeta(1913 + k)]$ (dotted line).
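
The recurrent computation of the forecast curve may be sketched as follows. The Python is illustrative only; the function name `forecast_curve` is hypothetical, and the mean m is assumed to be the column sum of Table 8 divided by the 74 observations.

```python
def forecast_curve(a, m, history, steps):
    """Forecasts F_t[zeta(t+1)], ..., F_t[zeta(t+steps)] for a scheme of order h.

    `history` must end with the h latest observations, oldest first.
    Each forecast replaces the unknown future shock by zero and feeds
    earlier forecasts back into the recursion.
    """
    h = len(a)
    if len(history) < h:
        raise ValueError("need at least h past observations")
    const = (1.0 + sum(a)) * m
    values = list(history[-h:])
    forecasts = []
    for _ in range(steps):
        nxt = const - sum(a[j] * values[-1 - j] for j in range(h))
        forecasts.append(nxt)
        values.append(nxt)
    return forecasts


if __name__ == "__main__":
    a_319 = [-0.7776, 0.6797, -0.1342, 0.2914]   # approach (319)
    m = -198 / 74                                 # mean of col. (1), Table 8
    zeta_1909_1912 = [8, -5, -34, 31]             # col. (1), Table 8
    curve = forecast_curve(a_319, m, zeta_1909_1912, steps=10)
    print([round(v, 1) for v in curve])
```

The first two values agree, after rounding, with the forecasts 41.4 and 5.2 quoted in the text; the later ones trace the damped oscillation of the forecast curve.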

The two forecast curves in fig. 12 yield a good illustration of
the prognosis situation in an approach of linear autoregression.
Firstly, while a forecast $F_t[\zeta(t+k)]$ is often rather efficient for
small k-values, the efficiency vanishes asymptotically as k increases
(cf. p. 165). Further, as soon as we are in a position to take a
new observation into consideration when forming the prognosis,
the forecast curve is often substantially modified; how much, will
depend on the residual $\hat\eta_{t+1} = \zeta_{t+1} - F_t[\zeta(t+1)]$. Summing up,
it is the short forecasts that are efficient. In this respect, we meet the
same situation as in the scheme of moving averages, and the same
contrast to the scheme of hidden periodicities (cf. p. 168). On the
other hand, under special circumstances the oscillations in a scheme
of linear autoregression are nearly functional, viz. nearly strictly
periodic: as remarked in discussing the sinusoidal limit theorem of E.
Slutsky (cf. p. 120), processes of hidden periodicities can be obtained
as limit cases of the schemes of linear autoregression.


As pointed out in section 21, the scheme of linear autoregression
constitutes the proper starting point when studying oscillatory
mechanisms which are subjected to random impulses. A typical
approach of this kind is formed by the complete systems as dealt
with in several recent economic studies (see e. g. R. Frisch (1933),
J. Tinbergen (1937)). A simple example of a complete system is
given by


(320) 


l{t) = c,i;{t-\)'^7)\t), 


For instance, as a first approximation we may take the production
volume $\xi(t)$ of a commodity to be linearly correlated with the price
$\zeta(t-1)$, and the price $\zeta(t)$ to be linearly built up by the production
volumes $\xi(t)$ and $\xi(t-1)$.

In this connexion, the pertinent thing is that a complete system
may be reduced to a single relation involving but one of the
fundamental variables $\xi$, $\zeta$, etc. Considering e. g. the simple case
(320), we get at once

(321)   $\zeta(t) + a_1\,\zeta(t-1) + a_2\,\zeta(t-2) = \eta(t)$,

where $a_1$ and $a_2$ are constants, and $\eta$ is linear in the variables $\eta'$
and $\eta''$.

Of course, in order to study a complete system in detail, we must
consider stochastical processes in several dimensions, a
generalization not in the program of the present study. However, it is
evident that theorem 9 applies directly to the reduced relations
exemplified by (321). Thus, under general conditions concerning
the variables $\eta(t)$ and the constants (a), the variables $\zeta(t)$ form a
stationary process. In point of principle, we are in a position to
investigate the properties of this process $\{\zeta(t)\}$. Considering e. g.
the autocorrelation coefficients, it follows without difficulty that
these will satisfy a linear difference equation with a right member.

It is seen that the theory of linear autoregression as developed
in sections 24 and 25 covers the case when the complete system
reduces to a relation with a purely random right member. We
have seen that this analysis has given certain results which cannot
be reached by functional methods. For instance, if the left member
of the relation (321) is characterized by an intrinsic damped
oscillation, and if the damping is too heavy, the tendency to
periodicity cannot be distinguished in the mechanism as subjected
to random impulses. Another example of this is given by the
approach (319), which presents two intrinsic periods. Having
already mentioned that one of these equals 7.5 years, a short
calculation will verify that the other period is 3.44 years. Now, as
shown in fig. 13, the root corresponding to the latter period is
lying rather near the centre of the unit circle, which implies that
the damping is rather heavy. Thus, even if the period 3.44 is
quantitatively reliable and significant (which seems to me rather
doubtful), we cannot conclude without further analysis that this
period can be found by a periodogram construction or similar
methods. Be that as it may, as pointed out before I attach no
importance to the quantitative significance of the above analysis of
the Myrdal index.

Starting from explicit assumptions about the relations between
the different variables in an economic system, E. Lundberg (1937)
has examined how the system will develop from hypothetical initial
conditions. Since several of the relations assumed are non-linear,
his approach may be looked upon as a generalization of the linear
systems as exemplified by (320). Now, the analysis of E. Lundberg
shows that the variables will often present tendencies to diverge
as an economic expansion goes on, tendencies causing tensions
which make the economic system instable. Of course, in point of
principle it would be possible to apply stochastical methods even
in this approach. However, it seems extremely difficult to obtain
in this way a non-evolutive scheme for the system considered.
Anyhow, in view of the investigations of E. Lundberg, the linear
approaches earlier discussed seem far from sufficient for giving an
adequate hypothetical model for studying the economic cycles in
detail. Looking upon the approach (319) from this viewpoint, it
might be said that even if this scheme does correctly sum up the
main features of the index examined, the interpretation in terms
of oscillatory mechanisms which is suggested by such a simple
approach cannot possibly be completely realistic.

Having in the previous section mentioned certain geophysical 
phenomena which invite to studies on the basis of the scheme of 
moving averages, this section will be terminated by a few suggestions 
about the wide applicability of the scheme of linear autoregression. 

As surveyed in section 29, G. U. Yule (1927) introduces the
concept of autoregression in studying the 11 year wave in the
sunspot numbers. Further, since the criterion of J. Bartels (1935)
(cf. p. 26) suggests that certain waves in terrestrial magnetism are
»quasi-persistent», the construction of this criterion makes us expect
that in these cases an autoregression approach will be fruitful. Of
course, here the scheme of linear autoregression suggests itself also
a priori. For instance, let us consider the 27 day wave, which
is due to the sunspot intensity and the rotation of the sun. The
duration of a sunspot often being rather long, the sunspot intensity
as observed in a time point t must be positively correlated
with the intensity 27 days earlier. Having stated this, it seems
plausible that a correlogram of terrestrial magnetism will present
an oscillation with a period of about 27 days. We must also
expect that the oscillation in the empirical correlogram will be
damped, and that the degree of damping will depend on the
average duration of a sunspot.

It is not difficult to point out other instances where the scheme
of linear autoregression seems plausible on theoretical grounds.
Even the simple scheme (237) involving but one parameter might
often prove useful, at least as a first approximation. For instance,
we have already found that the scheme (284) presents a correlogram
which fits rather well to the correlogram of air pressure examined
by Sir G. Walker (1931). In this case, the autoregression
obviously may be interpreted as an effect of inertia; generally
speaking, we may always expect an autoregression of type (237),
and with a positive constant p, when dealing with phenomena
characterized by irreversibility. According to formula (239), the
expectance corresponding to a long period is in such cases larger
than in a purely random series. In other words, there will be a
tendency to spurious periodicity, a tendency of the observational
time series to present long waves the lengths of which are varying
and without physical significance.

Finally, it is evident that the theory of autocorrelation may be
applied to the functional transform (21) used by N. Wiener (1930)
in the theory of light. Following up this idea, we are led to
interpreting the transmission of light as a stationary process. With
suitable arrangements about the dispersion of the process, the
expression (21) would then correspond to the autocorrelation coefficients,
while the function $S(\lambda)$ as given by (23) would reduce to the generating
function of the autocorrelation coefficients (cf. section 17). In this
approach, the continuous parts of the spectrum of light would
correspond to the continuity intervals of the generating function.
Perhaps the scheme of linear autoregression might serve as a
starting point for investigations on these lines, for according to
theorem 11 this scheme presents a continuous generating function,
and, as pointed out in section 25 (see p. 120), we may obtain any
scheme of hidden periodicities of type (39) as a limit case.


APPENDIX A.


On the $\omega^2$-method for testing goodness of fit.

Let $F^*(u)$ denote an empirical distribution function, and $F(u)$ a
hypothetical distribution function to be tested.

In the $\omega^2$-method for testing¹ the goodness of fit, the integral

(322)   $\omega^2 = \int_{-\infty}^{\infty} [F^*(u) - F(u)]^2 \, du$

is evaluated and compared with its expectation. This is given by

(323)   $E[\omega^2] = \frac{1}{n} \int_{-\infty}^{\infty} F(u)[1 - F(u)] \, du = \frac{g}{2n}$,

where n is the number of observations in the empirical material,
and g is the mean difference corresponding to the distribution
function $F(u)$ (see H. Cramér (1928), p. 145, and, as to the second
formula, H. Wold (1935), p. 48).

The $\chi^2$-method considers the frequency distribution in a set of
intervals. Let

(324)   $u_0 = -\infty, \; u_1, \; u_2, \; \dots, \; u_k = +\infty$

stand for the end points of the intervals considered, and denote
the frequency in the interval $(u_{i-1}, u_i)$ by

(325)   $f_i^* = F^*(u_i) - F^*(u_{i-1})$;   $f_i = F(u_i) - F(u_{i-1})$.

In these notations the $\chi^2$-test is given by

        $\chi^2 = \frac{n}{k-1} \sum_{i=1}^{k} \frac{(f_i^* - f_i)^2}{f_i}$,

and $E[\chi^2] = 1$.

¹ The N. Smirnoff (1936) modification of the $\omega^2$-test, viz.
$\omega^2 = \int_{-\infty}^{\infty} [F^*(u) - F(u)]^2 \, dF(u)$, has the beautiful property that
the probability distribution of the test is independent of $F(u)$.
As is well known, the weak point in the $\chi^2$-method lies in the
arbitrariness implied in the required grouping of frequencies. This
defect is especially fatal in the tails of the distribution, where the
frequencies $f_i$ are small, and the terms in $\chi^2$ are very susceptible
to changes in the grouping. For the same reason the $\chi^2$-method
is always inapplicable if the number of observations is small.

The above mentioned defect does not adhere to the ^--method, 
as the frequency grouping is not needed here. However, in respect 
to the a)^-method it might be maintained that the numerical evalua- 
tion of the integral (322) is toilsome. Further, if the statistical 
material is from the start given in the form of grouped frequencies 
fi, the function F{u) required for the test will be known only in 
the end points ut of the class intervals. In the following slight 
modification of the o^-test, both of these obstacles are removed. 
The modified test is defined by

(326)    $w^2 = \sum_i [F^*(u_i) - F(u_i)]^2,$

where the points u_i may be taken equidistant if the material is
ungrouped. If the material is given as grouped frequencies (325),
the u_i in (326) must, of course, be chosen among the u_i in (324).

It is seen that the modified test (326) is simple to compute.
Further, the arbitrariness in (326) as to the points u_i is unimportant.
In fact, they may be taken as close as desired, and have no systematic
influence upon w².

The expectation of w² is easily derived:

(327)    $E[w^2] = \sum_i E[F^*(u_i) - F(u_i)]^2 = \frac{1}{n} \sum_i F(u_i)[1 - F(u_i)],$

an expression readily computed in practice. In case the points u_i
are equidistant, say u_i = u + i·h, formula (327) gives

(328)    $E[w^2] = \frac{1}{h n} \int_{-\infty}^{\infty} F(u)[1 - F(u)] \, du + \frac{1}{n} R,$

where R is the remainder in the Euler-Maclaurin formula applied.
Neglecting R, formula (328) may be used alternatively with (327).

Illustration. During the years 1920-1934, a rural fire insurance company in
Sweden had the following yearly claim ratios, as expressed in per mille: .210, 1.042,
.261, .467, .708, .412, .455, .853, .641, .289, .669, .071, .303, .097, 1.039. The
empirical distribution function F*(u) obtained from these figures is shown in the
diagram below together with a Pearson Type III distribution function fitted to the
data, viz.

         $F(u) = a_0 \int_0^u x^p e^{-\gamma x} \, dx$ for u > 0,   F(u) = 0 for u ≤ 0,

with p = 1.725 and γ = 5.437.



Fig. 18. Hypothetical distribution function F(u) (broken line) fitted to an empirical
distribution function F*(u) (unbroken line).

For testing the goodness of fit, formulae (326) and (328) were used. It should
be observed that the material under investigation is too small for the application of
the χ²-test. Observing that n = 15, the simple computations may be followed in the
table below.


Table 10. Application of the w²-test.

   u     F*(u)   F(u)    1-F   |F*-F|   |     u     F*(u)   F(u)    1-F   |F*-F|
  .05     .00    .01     .99    .01     |    .85     .80    .87     .13    .07
  .15     .13    .07     .93    .06     |    .95     .87    .91     .09    .04
  .25     .20    .21     .79    .01     |   1.05    1.00    .94     .06    .06
  .35     .40    .38     .62    .02     |   1.15    1.00    .96     .04    .04
  .45     .47    .52     .48    .05     |   1.25    1.00    .97     .03    .03
  .55     .60    .64     .36    .04     |   1.35    1.00    .98     .02    .02
  .65     .67    .74     .26    .07     |   1.45    1.00    .99     .01    .01
  .75     .80    .81     .19    .01     |   1.55    1.00    .99     .01    .01

(F*: empirical distribution function; F: fitted hypothetical distribution function.)


There only remains to compute w² = Σ |F* - F|² and its expectation
E[w²] = (1/n) Σ F(1 - F). As is easily verified, w² = .0265 and
E[w²] = .11, which indicates a very nice fit.

It should be observed that in the above illustration the reduction of E[w²]
in accordance with the degree of freedom in n has not been taken into
consideration.
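
The arithmetic of the illustration can be retraced with a short Python sketch. The two lists are transcribed from Table 10 as printed (two decimals only), so the results agree with the text only to that accuracy; the function names are illustrative and do not occur in the original.

```python
# Columns of Table 10 at the grid points u = .05, .15, ..., 1.55 (h = .1).
# F_EMP: empirical distribution function; F_HYP: fitted Type III function.
F_EMP = [0.00, 0.13, 0.20, 0.40, 0.47, 0.60, 0.67, 0.80,
         0.80, 0.87, 1.00, 1.00, 1.00, 1.00, 1.00, 1.00]
F_HYP = [0.01, 0.07, 0.21, 0.38, 0.52, 0.64, 0.74, 0.81,
         0.87, 0.91, 0.94, 0.96, 0.97, 0.98, 0.99, 0.99]
N_OBS = 15  # number of yearly claim ratios


def modified_omega_square(f_emp, f_hyp):
    """Test quantity (326): sum of squared deviations at the grid points."""
    return sum((a - b) ** 2 for a, b in zip(f_emp, f_hyp))


def expected_w2(f_hyp, n_obs):
    """Expectation (327): (1/n) * sum of F(u_i) * [1 - F(u_i)]."""
    return sum(f * (1.0 - f) for f in f_hyp) / n_obs


w2 = modified_omega_square(F_EMP, F_HYP)
e_w2 = expected_w2(F_HYP, N_OBS)
print(f"w^2 = {w2:.4f}, E[w^2] = {e_w2:.4f}")
```

The observed w² falls well below its expectation, which is the sense in which the fit is called very nice.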





APPENDIX B. 


On the quantitative significance of correlation
coefficients.

It might appear paradoxical that correlation analysis, while of 
fundamental importance in many sciences, is in others severely 
criticized and denied quantitative significance. Let us exemplify
this contrast. In genetics, the fundamental facts are often expressed
in terms of correlation coefficients; the correlation between
the statures of father and son is a classical example. The correlation
coefficient is here in possession of an undisputed quantitative
significance. On the other hand, it is frequently argued that we
are concerned with a different, non-quantitative kind of correlation 
coefficient as often as the coefficients have been obtained from 
time series or spatial series. 

In the discussion of the significance of correlation coefficients, a 
chief argument is based on the general experience that time series 
and spatial series often give rise to nonsense correlation, or cor- 
relation coefficients from which it is difficult or impossible to draw 
quantitative conclusions. To a certain extent, this experience has 
been analysed theoretically. Thus, the effect of trends in time 
series has been studied in detail. In the terminology of the pre- 
sent volume, the trend problem in question may be outlined as 
follows. 

Let ξ_t, ξ_{t+1}, ..., ξ_{t+n} and η_t, η_{t+1}, ..., η_{t+n} represent two time
series, and let two corresponding hypothetical sets of random variables
be given by x(t) + ξ(t) and y(t) + η(t), where x(t) and y(t) are
functions, while ξ(t) and η(t) are random variables (cf. (39)). When
studying the relations between the series ξ_t and η_t, it may be that
the interest is concentrated upon the variables ξ(t) and η(t), thus
upon the coefficient r[ξ(t), η(t)] instead of r[x(t) + ξ(t), y(t) + η(t)].
As a rule, this happens when x(t) and y(t) are trend functions, e.g.
parabolas, exponential or logistic curves; the relevant thing is then
to study the correlation between the trend deviations ξ(t) and η(t).
Otherwise expressed, an interdependence between the two series is
produced by time itself, i.e. by the common variable t in the two
functions x(t) and y(t). Generally speaking, it depends on the nature
of the phenomena described by the time series whether or not
this interdependence should be removed, i.e. whether
r[x(t) + ξ(t), y(t) + η(t)] or r[ξ(t), η(t)] should be preferred as a
measure of the correlation between the two time series considered.
This remark brings into relief the necessity both of formulating the
problems strictly when studying time series, and of doing this in
terms of explicit hypothetical models.

A second introductory reference is given to a note of V. v.
Szeliski (1934). In a time series ξ_t, ξ_{t+1}, ..., ξ_{t+n} which consists
of a sequence of frequencies it may occur, v. Szeliski says, that
the recording has been disturbed by alternating errors, i.e. such
mistakes that one or more elements are included in ξ_{t+1} instead of
in ξ_t. Considering the correlation between two time series ξ_t
and η_t, such alternating errors tend to diminish the correlation
coefficient. Moreover, if the time intervals are enlarged (e.g. so
that the frequencies ξ_t + ξ_{t+1}, ξ_{t+2} + ξ_{t+3}, etc. are recorded
instead of ξ_t, ξ_{t+1}, etc.) the effect of the alternating errors will
decrease.

The above references to the trend problem and to the alternating 
errors show that correlation analysis is more intricate in time se- 
ries than in such classical applications as the genetic example. The 
purpose of this appendix (cf. also p. 109 f. and my preliminary 
note (1936)) is to draw attention to a circumstance which complic- 
ates the situation still further. The point in question is based on 
an elementary distinction, viz. between what I have proposed to
term definite and modifiable statistical units. It will be found that
a correlation index is unconditionally quantitative only when
referring to a definite unit. On the other hand, correlation indices
which refer to modifiable units are not directly commensurable.

In the case of modifiable units, the statistical analysis requires 
more refined methods. We advance further that modifiable units 
are very common in the cases of time series and spatial series. 
The subject being wide and uninvestigated, we shall not aim at
generality in our analysis, but instead examine in some detail a
typical case. Since the distinction between definite and modifiable
units is quite elementary, we must begin by analyzing some fundamental
statistical concepts.

Statistical data always have reference to a statistical population
formed by statistical units or individuals; elementary, unprepared
statistical data always refer to some property of the units, and
describe quantitatively (by class-numbers or by measure-numbers)
the individual units in respect of the property considered. Having
stated this, a statistical unit will be termed definite if its delimitation
is by nature unique. Otherwise, i.e. if any arbitrariness
attaches to its size, the unit will be termed modifiable. For the
sake of concreteness, let us at once exemplify the distinction made.

Recurring to the correlation between the statures of father and
son, a specific correlation coefficient may always be taken to refer 
to a statistical population formed by a group of families. The two 
properties of the statistical unit, the family, which are correlated, 
are the stature of the family father and the stature of, say, his 
eldest son. In this example, the statistical unit is definite. 

Secondly, let us examine the claim ratios considered in Appen- 
dix A. In this case, the statistical units are constituted by a 
certain rural district in Sweden as observed in separate years, the 
different years corresponding to different statistical units. The 
property examined is the claim ratio of the rural fire insurance 
company in the district considered, the claim ratio being defined 
as the quotient between the total fire claims paid in the year by 
the company, and the total sum insured. In this example, the 
statistical units are modifiable. In fact, we may modify the units 
in respect of time, and consider, say, the monthly instead of the 
yearly claim ratios. The modifiability is even double, for we can 
as well make geographical modifications, i. e. consider the claim 
ratios in a larger or smaller district. 

The modifiability implies that a change in the sizes of the units 
will give rise to a new statistical population which has a real 
meaning and is of the same type as the original population. Having 
stated this, it is evident that we can formally perform similar 
operations in a population of definite units; the difference is that 
the modification will in the latter case give rise to fictive units, 
or units of an essentially different kind than the original ones. 
In order to illustrate this, we shall consider once more the first 
example. Let it be assumed that we possess data from a number 
of Swedish families, and from equally many Norwegian families.
We can then exemplify an operation corresponding to the combination
of modifiable units by arranging first the families in pairs so
that every pair will contain one Swedish and one Norwegian family,
and forming secondly for every such pair the sum of the statures
of the two fathers and the sum of the statures of the two sons.
Formally, this operation gives rise to a statistical population
consisting of the same number of units as each of the two original
ones. So far there is a complete analogy with the combination of
modifiable units. But the statistical units obtained are altogether
fictive, being artificial "families" where e.g. the "stature of the
father" is the sum of two real, individual statures.

In an illustration below we shall correlate a series of yearly 
claim ratios as considered above with a time series which reflects 
the business cycle. For this purpose, we shall use the yearly number 
of business failures in the rural districts of Sweden. Evidently, 
the statistical units will be of the same type as in the second 
example, i. e. modifiable. The only difference is that we are in this 
case concerned with another property of the statistical units. 

As a final example, let us consider a time series obtained by 
regular measurements of the air pressure at a certain meteorological 
station. In this case, each observation refers to a definite time 
point and to a definite geographical spot. Accordingly, the 
statistical units are definite. On the other hand, if the observations 
were given in the form of averages — e. g. the average air pressure 
during a month, or the average air pressure in a certain district 
— then the units would obviously be modifiable. 

After these examples it is evident that as often as the statistical 
data refer to time intervals or to geographical areas, the statistical 
units are modifiable. Moreover, we have seen two type cases of 
modifiable units. In the first case the statistical data were based 
on recorded frequencies, in the second the data were averages of 
recorded measurements. 

Thus prepared, we shall next proceed to a more thorough examination
of a typical case. It will be found that in a careful
analysis of statistical data referring to modifiable units, we must
let the hypothetical schemes be influenced by the sizes of the units.
In other words, more refined methods will be required than in the
classical correlation analysis.

In table 11, cols. (2)-(5) are given three time series of the
type examined in Appendix A, i.e. series of claim ratios in Swedish
rural fire insurance companies. The series in cols. (2)-(4) have
been formed on the basis of all companies which are active in
southern, central, and northern Sweden respectively, while col. (5)
refers to the whole of Sweden. With the same geographical
classification, cols. (6)-(9) indicate the yearly number of business
failures; because of the lag in official registration, these data have
been set back one year. Fig. 19 illustrates the data referring to
northern Sweden and to the whole of Sweden respectively.


Table 11. Fire insurance claim ratios (‰), and business failures in
rural Sweden 1920-34.

         Claim ratios in different districts.     Business failures in different districts.
 Year    South.    Centr.   North.   Whole        South.   Centr.   North.   Whole
 (1)      (2)       (3)      (4)      (5)          (6)      (7)      (8)      (9)
 1920    1.911     .673     .778     .927          379     1 027    1 420    2 826
   21  [illegible] .953     .761    1.012          413       848    1 612    2 873
   22    2.216     .592     .615     .880          381       661    1 086    2 128
   23    1.716     .455     .651     .758          320       589    1 072    1 981
   24    1.660     .556     .494     .702          260       608      987    1 855
   25    1.749     .720     .622     .836          225       627      938    1 790
   26    1.734     .648     .764     .878          253       551      881    1 685
   27    1.679     .533     .543     .703          316       528      772    1 616
   28    1.995     .498     .599     .785          346       536      824    1 706
   29    1.686     .573     .622     .758          283       543      763    1 579
 1930    1.498     .653     .704     .813          454       651      916    1 921
   31    2.733     .648     .689     .999          513       827    1 093    2 433
   32    2.702     .572     .708     .972          346       712      984    2 042
   33    1.669     .633     .783     .864          266       445      591    1 302
   34    1.269     .397     .581     .617          170       292      510      972

According to the previous discussion (see p. 196), we are in table
11 concerned with modifiable statistical units. For instance, the
units referred to in col. (5) have been obtained by grouping together
the units described in cols. (2)-(4). Of course, owing to the
damping effect upon random fluctuations in large statistical populations,
we must expect that the combining of districts will give rise
to smoother curves. In agreement with this argument, those curves
in fig. 19 which refer to northern Sweden are not quite as smooth
as those referring to the whole of Sweden. This damping effect
of grouping is well-known, and needs no comment. What is of
interest for the moment, and the point of this appendix, is the
fact that, considering the correlation between claim ratio and
business failures, the grouping has a systematic tendency to increase
this correlation. Actually, correlating the claim ratios and the
numbers of business failures in the whole of rural Sweden, the
resulting correlation coefficient is .750. For southern, central and
northern Sweden the coefficients obtained from table 11 are .599,
.604 and .375 respectively.
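
The coefficient quoted for the whole of Sweden can be retraced from cols. (5) and (9) of table 11. The following Python sketch assumes the ordinary product-moment coefficient with divisor n for the dispersions (an assumption that reproduces the dispersions .1118 and 494 given later in table 12); the variable and function names are illustrative.

```python
import math

# Table 11, col. (5): claim ratios (per mille), whole of Sweden, 1920-34.
CLAIM_WHOLE = [0.927, 1.012, 0.880, 0.758, 0.702, 0.836, 0.878, 0.703,
               0.785, 0.758, 0.813, 0.999, 0.972, 0.864, 0.617]
# Table 11, col. (9): business failures, whole of Sweden (set back one year).
FAILURES_WHOLE = [2826, 2873, 2128, 1981, 1855, 1790, 1685, 1616,
                  1706, 1579, 1921, 2433, 2042, 1302, 972]


def dispersion(xs):
    """Standard deviation with divisor n (assumed convention)."""
    mean = sum(xs) / len(xs)
    return math.sqrt(sum((x - mean) ** 2 for x in xs) / len(xs))


def correlation(xs, ys):
    """Product-moment correlation coefficient of two equally long series."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)
    return cov / (dispersion(xs) * dispersion(ys))


r_whole = correlation(CLAIM_WHOLE, FAILURES_WHOLE)
print(f"r = {r_whole:.3f}, D(claims) = {dispersion(CLAIM_WHOLE):.4f}, "
      f"D(failures) = {dispersion(FAILURES_WHOLE):.0f}")
```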

Fig. 19. Fire insurance claim ratios and business failures in rural Sweden
1920-34. Thick lines: Whole Sweden. Thin lines: Northern Sweden. Unbroken
lines: Claim ratios. Broken lines: Business failures.

The smoothing effect and the increase of correlation are both
connected with the law of great numbers. In fact, denoting by
ξ_t and ζ_t the claim ratio and the number of business failures, and
by {ξ(t)} and {ζ(t)} two corresponding random processes (see section
11) which are stationary or non-stationary, it is plausible to assume
that each of the variables ξ(t) and ζ(t) is composed of a systematic
component, say ξ*(t) and ζ*(t) respectively, which is unaffected by
the grouping, and a random component, say α(t) and β(t) respectively,
which is subject to the law of great numbers. Accordingly, in
grouping the data the random components α(t) and β(t) will
decrease relatively to ξ*(t) and ζ*(t). On one hand, in ξ(t) and
ζ(t) the trends of ξ*(t) and ζ*(t) will then become less blurred by
the random components α(t) and β(t). This is the smoothing effect.
On the other hand, the tendency to an increase in the correlation
depends on the structure of the correlation coefficient. Proceeding
to prove this, we shall make rather simplifying assumptions.

Retaining the notations, we have

(329)    $\xi(t) = \xi^*(t) + \alpha(t); \quad \zeta(t) = \zeta^*(t) + \beta(t).$

We shall assume that the processes {α(t)} and {β(t)} are purely
random and independent of each other and of {ξ*(t)} and {ζ*(t)}.
Assuming further E[α(t)] = E[β(t)] = 0, we get
E[ξ(t)·ζ(t)] = E[ξ*(t)·ζ*(t)], and

(330)    $D^2(\xi) = D^2(\xi^*) + D^2(\alpha), \quad D^2(\zeta) = D^2(\zeta^*) + D^2(\beta).$

Hence

(331)    $r[\xi(t), \zeta(t)] = \frac{E[\xi(t)\,\zeta(t)] - E[\xi(t)]\,E[\zeta(t)]}{D(\xi)\,D(\zeta)} = \frac{r[\xi^*(t), \zeta^*(t)]}{\sqrt{1 + D^2(\alpha)/D^2(\xi^*)}\,\sqrt{1 + D^2(\beta)/D^2(\zeta^*)}}.$

Let us now assume that we combine units which are homogeneous
in respect to the properties considered. This implies that D(α)/D(ξ*)
and D(β)/D(ζ*) will vary in an inverse proportion to the size of the
unit. Thus, these quotients will tend to zero as the unit size
tends to infinity. Formula (331) shows that r[ξ(t), ζ(t)] in such
a case will tend to the limit r[ξ*(t), ζ*(t)]. On the other hand, if
the unit size tends to zero, D(α)/D(ξ*) and D(β)/D(ζ*) will tend to
infinity. The correlation coefficient (331) will then tend to zero.
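
The two limits can be made tangible with a small numerical sketch of formula (331). The value r* = .9 and the sequence of quotients are hypothetical figures chosen only for illustration; the function name is likewise an assumption.

```python
import math


def total_correlation(r_star, quotient_xi, quotient_zeta):
    """Formula (331): total correlation from the systematic correlation r*.

    quotient_xi   = D(alpha) / D(xi*)
    quotient_zeta = D(beta)  / D(zeta*)
    """
    damping = math.sqrt((1 + quotient_xi ** 2) * (1 + quotient_zeta ** 2))
    return r_star / damping


R_STAR = 0.9  # hypothetical correlation between the systematic components

# Quotients decreasing towards zero, as when units are combined.
quotients = [4.0, 2.0, 1.0, 0.5, 0.1, 0.0]
totals = [total_correlation(R_STAR, q, q) for q in quotients]
for q, r in zip(quotients, totals):
    print(f"D(random)/D(systematic) = {q:3.1f}  ->  total r = {r:.3f}")
```

As the quotients shrink the total coefficient rises towards r*, and for large quotients it falls towards zero, in agreement with the two limits stated above.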

Of course, the tendencies pointed out above will be present under 
more general assumptions concerning the components of the variables 
examined. However, the above discussion is sufficient for our 
purpose, which is to give a basis for a few general statements 
concerning correlation analysis of modifiable units. 

The situation may be described as follows. Any correlation index
R referring to a modifiable unit is no proper average of the corresponding
indices R_i referring to special parts of the unit. Owing to
the relatively larger dispersion of random components at reduced units,
|R| is frequently larger than any |R_i|.

Adopting a term proposed by C. Spearman (1907) in a related
case (see below) we shall say that the above effect of the division
of units is an attenuation (Verdünnung) of correlation. Conversely,
the effect of combining units will be called a condensation (Verdichtung)
of correlation.

We can also describe the situation by saying that any correlation
index referring to a modifiable unit is quantitatively conditioned by the
size of the unit referred to. In other words, correlation indices
referring to units of different sizes are not directly comparable. For
instance, returning to the correlation between fire claims and business
failures, the coefficient .375 of northern Sweden is not directly
comparable with the figure .750 of whole Sweden, nor with the
figures .599 and .604 of southern and central Sweden respectively.

At first glance it might seem paradoxical that a correlation 
coefficient referring to a geographic district is larger than the 
coefficients referring to the sub-districts. We have seen that 
formula (331) resolves the paradox, the explanation being that 
without further analysis a correlation coefficient referring to a 
modifiable unit can give only qualitative information as to whether 
or not a correlation is present. Having found this, our next 
question will be whether it is at all possible to arrive at quantitative
statements about correlation when concerned with modifiable
units. In dealing with this problem, we shall attach the analysis 
to formula (331), and try to estimate the variances of the systematic 
and the random components. We shall see that in cases such as 
the above this can be done by means of the variate difference 
method, and that we in this way can obtain a quantitative inter- 
pretation of the correlation between the series examined. 

Considering the variables (329), we get without difficulty

(332)    $D^2(\Delta^k \alpha(t)) = \binom{2k}{k} D^2(\alpha(t)).$

Moreover, if the variables ξ*(t) are such that, for a certain k, the
differences Δ^k ξ*(t) are approximately vanishing, we get

(333)    $D^2(\alpha) \approx D^2(\Delta^k \xi(t)) \Big/ \binom{2k}{k}.$

Formulae (332) and (333) form the basis of the variate difference
method (see e.g. O. Anderson (1929) and G. U. Yule (1921)).
Calculating for k = 1, 2, ... the empirical expressions corresponding
to (333), viz.

(334)    $\frac{1}{(n-k)\binom{2k}{k}} \sum_t (\Delta^k \xi_t)^2,$

where n is the number of observations in the series and the sum
extends over the n - k differences of order k, the values obtained will,
roughly speaking, in a few steps decrease to a limit level. This is
taken as an estimate of D²(α).

Applying formula (334) to the series given in table 11, and
forming the square roots of the resulting values, the first three
rows of table 12 were obtained. The columns are numbered as in
table 11. In the fourth row are given the dispersions of the
original series.
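
As a check on the procedure, the following Python sketch applies the empirical expression (334), in the form of a sum of squared k-th differences divided by (n - k) times the binomial coefficient C(2k, k), to the claim ratios of the whole of Sweden. That exact form is an assumption inferred from the figures; it reproduces col. (5) of table 12 to the printed accuracy. The function name is illustrative.

```python
import math

# Table 11, col. (5): claim ratios (per mille), whole of Sweden, 1920-34.
CLAIM_WHOLE = [0.927, 1.012, 0.880, 0.758, 0.702, 0.836, 0.878, 0.703,
               0.785, 0.758, 0.813, 0.999, 0.972, 0.864, 0.617]


def variate_difference_estimates(series, max_order=3):
    """Square roots of expression (334) for k = 1, ..., max_order."""
    n = len(series)
    diffs = list(series)
    estimates = []
    for k in range(1, max_order + 1):
        # k-th differences are the first differences of the (k-1)-th ones.
        diffs = [b - a for a, b in zip(diffs, diffs[1:])]
        variance = sum(d * d for d in diffs) / ((n - k) * math.comb(2 * k, k))
        estimates.append(math.sqrt(variance))
    return estimates


estimates = variate_difference_estimates(CLAIM_WHOLE)
for k, value in enumerate(estimates, start=1):
    print(f"k = {k}: {value:.4f}")
```

The three values decrease with k, as the method leads one to expect, and the last of them serves as the estimate of the random component dispersion.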


Table 12. Estimates used in correlation analysis of table 11.

           (2)      (3)      (4)      (5)      (6)    (7)    (8)    (9)
 k = 1    .3576    .1017    .0824    .0869      59     97    144    255
 k = 2    .3030    .0801    .0778    .0636      76     62    121    174
 k = 3    .2612    .0601    .0750    .0541      39     54    104    140
 D        .4018    .1246    .0862    .1118      88    171    274    494
 D*       .3047    .1091    .0425    .0978      66    162    253    474
 q₁       4.83     1.11     1.39     1         .42    .39    .74    1
 q₂       3.73     1.46     1.27     1         .42    .57    .71    1

In agreement with the theory of the variate difference method,
the first three values in each column (except for the 6th)
form a decreasing sequence. Following up the method used, let
us fix the values obtained for k = 3 as estimates of the dispersions
of the random components; in column (6), we choose instead the
first value, i.e. 59.

Having fixed the estimates of the random component dispersions,
formula (330) supplies us with corresponding estimates, say D*, of
the dispersions of the systematic components. For instance, col.
(2) gives D* = √((.4018)² - (.2612)²) = .3047. The estimates thus
obtained are given in the fifth row of table 12.
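
The step from formula (330) to the fifth row of table 12 is a one-line computation. The Python sketch below uses the rounded entries of table 12 as printed (with 59 as the random dispersion in col. (6), as stated in the text); the dictionary names are illustrative. With rounded inputs, cols. (2) and (6) come out near .3053 and 65, slightly off the printed .3047 and 66, while the other columns agree.

```python
import math

# Fourth row of table 12: dispersions D of the original series, by column.
TOTAL = {2: 0.4018, 3: 0.1246, 4: 0.0862, 5: 0.1118,
         6: 88, 7: 171, 8: 274, 9: 494}
# Adopted estimates of the random component dispersions (k = 3 row,
# except col. (6), where the first value 59 is chosen).
RANDOM = {2: 0.2612, 3: 0.0601, 4: 0.0750, 5: 0.0541,
          6: 59, 7: 54, 8: 104, 9: 140}

# Formula (330) solved for the systematic dispersion: D* = sqrt(D^2 - D(random)^2).
systematic = {col: math.sqrt(TOTAL[col] ** 2 - RANDOM[col] ** 2)
              for col in TOTAL}

for col, value in systematic.items():
    print(f"col. ({col}): D* = {value:.4f}")
```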

Observing that formula (331) may be written

(335)    $r^* = r[\xi^*(t), \zeta^*(t)] = \frac{D(\xi)}{D(\xi^*)} \cdot \frac{D(\zeta)}{D(\zeta^*)} \cdot r[\xi(t), \zeta(t)],$

we get a rough estimate of the left member by inserting the earlier
obtained estimates of the quantities appearing in the right member.
For instance, the estimates referring to the whole of Sweden give

         $r^* = \frac{.1118}{.0978} \cdot \frac{494}{474} \cdot (.750) = .89.$


Expressing the result in words, we have performed a hypothetical 
decomposition of the claim ratio and the number of business failures, 
so that each series is supposed to consist of a systematic and a 
random component. The correlation between the two series arises 
from the correlation between the systematic components. Estimating
the correlation coefficient of the systematic components in the two
series referring to the whole of Sweden, we have found .89.

It is illuminating to compare the above application of the variate
difference method with the well-known application in the trend
problem touched upon in the introduction of this appendix. Keeping
the notations, the method is in the latter case used for removing
the functional components x(t) and y(t), the relevant problem being
to study the correlation between the components ξ(t) and η(t). In
our case, it is ξ(t) and η(t) that are removed.

We shall next investigate the geographic variation of r*. Applying
formula (335) to the sub-districts referred to in cols. (2)-(4),
we get the estimates 1.07, .73 and .82 for southern, central and
northern Sweden respectively. In agreement with the above theory,
and in contradistinction to the original or total correlation
coefficients r, the coefficient r* referring to the whole of Sweden is a
genuine average of the coefficients r* referring to the sub-districts.
A short calculation shows that the arithmetical mean of the latter
values is .87, a value close to the estimate r* = .89 based on the
total material.
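
Formula (335) can be applied to all four districts at once. The Python sketch below inserts the rounded entries of table 12 and the total coefficients quoted in the text; the names are illustrative. Because the inputs are rounded, the southern value comes out near 1.05 rather than the printed 1.07, while the other districts agree with the text to two decimals.

```python
# Per district: D and D* of the claim ratios, D and D* of the business
# failures (table 12, rows four and five), and the total coefficient r.
DISTRICTS = {
    "south":   (0.4018, 0.3047, 88, 66, 0.599),
    "central": (0.1246, 0.1091, 171, 162, 0.604),
    "north":   (0.0862, 0.0425, 274, 253, 0.375),
    "whole":   (0.1118, 0.0978, 494, 474, 0.750),
}


def systematic_correlation(d_x, d_x_star, d_y, d_y_star, r_total):
    """Formula (335): r* = [D(x)/D(x*)] * [D(y)/D(y*)] * r."""
    return (d_x / d_x_star) * (d_y / d_y_star) * r_total


r_star = {name: systematic_correlation(*values)
          for name, values in DISTRICTS.items()}
sub_mean = (r_star["south"] + r_star["central"] + r_star["north"]) / 3

for name, value in r_star.items():
    print(f"{name:8s} r* = {value:.2f}")
print(f"mean of sub-districts = {sub_mean:.2f}")
```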

The estimates obtained for r* suggest that there is some geographic
variation present in the correlation between the systematic
components in the series under analysis. Of course, the estimates
of r* give but a rough idea of this variation, the estimates having
been obtained by a very crude method. This is already clear from
the fact that for one of the districts the estimate of r* is above
unity.

Finally, we shall touch upon a method to test the estimates
yielded by the variate difference method. Considering the random
components, it is plausible to assume as a first approximation that
the variance of the random component in the claim sum and in
the number of business failures is proportionate to the total claim
sum and to the total number of business failures respectively (see
e.g. H. Cramér (1927), p. 161). For the claim ratio, we get a
corresponding estimate by dividing the total claim sum by the
square of the total sum insured. In this way we get an estimate
of the relative magnitude of the dispersions of the random components
in the different districts, and this estimate is independent
of the variate difference method. Writing q₂ for this estimate, the
dispersion of the random component in the series referring to the
whole of Sweden being taken for unit, I have found for q₂ the
values given in the last row in table 12. Writing q₁ for the
corresponding values given by the variate difference method, and
considering e.g. col. (2), we get q₁ = .2612 / .0541 = 4.83. The
values of q₁ are likewise given in table 12.
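
The ratios obtained from the variate difference method follow directly from the adopted random component dispersions. A minimal Python sketch, using the rounded table entries and illustrative names, takes the whole of Sweden as unit separately for the claim ratios (col. (5)) and for the business failures (col. (9)):

```python
# Adopted random component dispersions (table 12; 59 in col. (6)).
RANDOM = {2: 0.2612, 3: 0.0601, 4: 0.0750, 5: 0.0541,
          6: 59, 7: 54, 8: 104, 9: 140}

# Reference column for each group: (5) for claim ratios, (9) for failures.
REFERENCE = {2: 5, 3: 5, 4: 5, 5: 5, 6: 9, 7: 9, 8: 9, 9: 9}

relative = {col: RANDOM[col] / RANDOM[REFERENCE[col]] for col in RANDOM}

for col, value in relative.items():
    print(f"col. ({col}): ratio = {value:.2f}")
```

The ratios agree with the 4.83, 1.11, 1.39 and .42, .39, .74 entries of table 12.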

The above test of the estimates delivered by the variate difference
method suggests that the estimate of the random component
dispersion is too large in col. (2) and somewhat too small in cols.
(3) and (7). If we paid regard to this test, we would get a larger
dispersion of the systematic component in col. (2), and a smaller
one in cols. (3) and (7). The estimate of r* yielded by (335) for
southern Sweden would then decrease from 1.07 to a value just
below unity. In the same way, the estimate obtained for central
Sweden would come nearer to the value obtained for the whole of
Sweden.

Let us sum up the above analysis. We have started from the 
hypothesis that the time series examined are composed of a system- 
atic component and a random component. By hypothesis, the former 
is unaffected when modifying the statistical units, while the latter 
is subject to a relative damping according to the law of great 
numbers. This simple and plausible hypothesis is sufficient to 
explain the decrease in the correlation coefficient when passing to 
time series referring to smaller geographic districts. In fact, the 
correlation between the systematic components is the same, but 
since the division gives rise to relatively larger random components, 
the total correlation will decrease. On the other hand, the correla- 
tion coefficients between the systematic components wilj. be strictly 
quantitative. Accordingly, the coefficients referring to different 
geographical districts are directly comparable, whereas the total 
coefficients are not. Thus, in terms of the hypothetical decomposi- 
tion we can describe the correlation situation quantitatively. 
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In cases such as that investigated above, the variate difference 
method provides ns with estimates sufficient for a quantitative 
correlation analysis. As has been the principle throughout in this 
volume, the purpose of this application is not to supply a definite 
theory of the phenomena examined, but merely to illustrate in detail 
how the methods proposed will work in practice. In the present 
application, this limitation is the more self-evident, the time series 
analyzed being very short. 

The above analysis covers but a particular case of modifiable 
units. In other cases, the analysis will require other methods. 
In the above example, it was possible to apply the variate difference 
method, a circumstance due to the fact that we were concerned 
with time series. In spatial series, the modifiability gives rise to 
more intricate problems. For instance, let us form a spatial series 
by dividing Sweden into a large number of small sub-districts, and 
tabulate for each district the claim ratio of a fixed year. Forming 
a corresponding series of business failures, let us consider the 
correlation between the two spatial series obtained. The hypothe- 
sis of a systematic and a random component being plausible in 
any series referring to modifiable units, it is evident that even in 
this case the quantitative significance of the correlation coefficient 
is conditioned by the sizes of the statistical units. However, the 
different data in the same series refer to statistical units of different
sizes. Consequently, we cannot as before use the variate
difference method on the original series to eliminate the systematic 
component; other methods are required. 

Having pointed out the disturbing effect of the random compon- 
ent, it might be asked whether anything general can be said of 
the magnitude of the effect. Of course, the effect depends on the 
relative magnitude of the random component. With large statist- 
ical masses, the random component often must be relatively small. 
In such cases the disturbing effect might be neglected, and the 
correlation coefficients identified with the correlation coefficients 
between the systematic components. 

We shall not carry the analysis further; the purpose of this 
appendix is not to outline a correlation theory of modifiable units, 
but rather to state a program, to point out the need of definite 
hypotheses and of careful and refined methods in the analysis of 
modifiable statistical units.

In conclusion, a few references will be made to related investigations.

Having searched in the statistical literature for viewpoints related
to the above analysis, I have found but one paper to quote, viz. a
short note by C. E. Gehlke and Katherine Biehl (1934). Following
a suggestion of H. Sheldon, Gehlke and Biehl present
certain correlation coefficients which support Sheldon's experience
that in spatial series the correlation coefficient has a tendency to
increase as districts are combined. The series examined by
Gehlke and Biehl refer to modifiable units, and are of the type
exemplified on p. 205. The authors give no theoretical analysis
of the tendency pointed out. However, our hypothetical decomposition
into a systematic and a random component is sufficient to
give an explanation of the tendency in question.

Finally, it is of interest to call attention to a formal parallel to 
our analysis of modifiable units. The hypothetical decomposition 
into a systematic and a random component is regularly used in the 
theory of measurements, the accidental errors in the measurements 
being regarded as purely random. Consequently, studying the 
correlation between two phenomena, a theoretical analysis leads to 
a formula of type (331), which shows that the larger the errors, 
the smaller the observed correlation. P. C. Mahalanobis (1922)
seems to have been the first to point out this disturbing effect of
measurement errors (see also W. Portio (1936)). The C. Spear-
man (1907) correction for attenuation of correlation (cf. also W.
Brown and G. H. Thomson (1925), Chapter VIII, and T. L. Kelley
(1924), Chapter IX) can be interpreted as a particular case of
reckoning with measurement errors.¹ A series of scores in a test
of mental abilities of an individual is rather unreliable because of 
accidental circumstances. This bias can be neutralized by averaging 
the scores of several tests. In other words, a test series is consi- 
dered to be composed of a systematic and a random component, the 
latter being damped when taking the average of different test
series. Thus, correlating test series from two individuals, the 
correlation will be larger when using series obtained by averaging. 
The ideal case of an infinite number of averaged tests would give 
the correlation between the systematic components. This correla-
tion can be estimated by applying the Spearman correction of
attenuation, a correction based on a formula of type (331).
Considering a two-dimensional random variable, say [ξ, ζ], and
forming a random sample of n elements, say [ξ_1, ζ_1], ..., [ξ_n, ζ_n],
S. D. Wicksell (1918) has studied the problem of estimating the
correlation between ξ and ζ on the basis of the correlation in the
sample. The formulae of Wicksell show, i.a., that in general the
sampling correlation is smaller, a result which is in agreement
with the above analysis, for each sample element may be looked
upon as composed of a systematic and a random component.

¹ I am indebted to Professor W. I. Thomas for this reference.
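
The averaging argument and the correction of attenuation discussed above can be sketched in a few lines. The two functions use the textbook forms of the Spearman-Brown formula for the reliability of an average of k parallel tests and of Spearman's correction; they are given as an illustration under the assumption of uncorrelated random components of equal variance, and are not claimed to be formula (331). All names and numerical values are hypothetical.

```python
import math

def reliability_of_average(r_single, k):
    """Spearman-Brown: share of systematic variance in the mean of k parallel tests."""
    return k * r_single / (1.0 + (k - 1) * r_single)

def disattenuate(r_observed, reliability_x, reliability_y):
    """Spearman's correction: estimate of the correlation between systematic components."""
    return r_observed / math.sqrt(reliability_x * reliability_y)

rho_systematic = 0.8   # assumed correlation between the systematic components
rel_single = 0.5       # assumed reliability of a single test series

for k in (1, 4, 16, 64):
    rel_k = reliability_of_average(rel_single, k)
    # Both averaged series are taken to be equally reliable, so the
    # attenuation factor sqrt(rel_k * rel_k) reduces to rel_k.
    r_obs = rho_systematic * rel_k
    r_corrected = disattenuate(r_obs, rel_k, rel_k)
    print(f"k={k:3d}  reliability={rel_k:.3f}  observed r={r_obs:.3f}  corrected r={r_corrected:.3f}")
```

The observed coefficient rises toward the systematic correlation as more test series are averaged, while the corrected coefficient recovers it for every k, which is the sense in which the correction stands in for the ideal case of an infinite number of averaged tests.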